OpenSearch Tips for Rapid Vector Search and Search Prototyping


Introduction

If you look for advice on how to set up OpenSearch, you’ll find lots of articles discussing how to handle large, complex production deployments. If you’re using OpenSearch as a platform to test out novel ideas at a small scale, though, much of that advice doesn’t apply to you. The rapid change and brief duration of proof-of-concept work create different pressures than the large scale and redundancy requirements of production systems. Given the changing landscape of search and the increasing interest in new applications of vector search, I thought I’d share insight from my own recent prototype, which used OpenSearch to build a scene-oriented multimodal retrieval workflow. Hopefully this can help those who are either jumping into OpenSearch for the first time or who have mostly interacted with it in large stable systems. It’s a different engineering challenge when things change by the hour instead of the week and when breaking something means a couple of hours of debugging rather than a production outage.

Basic Architecture

For any OpenSearch project, one of the most important systems to set up is OpenSearch itself. The documentation does a good job showing what steps are necessary to get a basic default setup running. Once you start customizing things though, most of the documented advice is for production systems. For small-scale novel proof-of-concept systems, it’s generally best to set up as little of the overall system as possible to go faster. To accomplish that without skipping steps that will cause more headaches later, here are a few useful pieces to keep in mind when going through the initial setup.

Is security necessary?

Security is an important part of setting up OpenSearch since it interacts with nearly every component. Correctly configuring it, however, is usually a pain. Depending on what level of project you are setting up, you may need to show that security features work or you may not need them at all. If the goal is to prove that a particular functionality works or test the quality of search results, it’s probably better to leave security off. If the goal is to prove at a small scale how to build a larger production system, you’d be better served to practice using the system with security in place. This choice is especially important because security will need to be present on almost all API calls in the system.
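Because the security choice touches almost every call, it can help to make that choice in exactly one place. The sketch below is one way to do that with Python's standard library. The host, port, and credentials are placeholders I have assumed for illustration, and `build_request` is a hypothetical helper, not part of any OpenSearch client.

```python
import base64
import urllib.request

# Assumed local defaults; the credentials are placeholders, not real values.
HOST = "localhost:9200"
USERNAME, PASSWORD = "admin", "change-me"


def build_request(path, body=None, method="GET", secure=False):
    """Build (but do not send) a request, adding auth only when security is on."""
    scheme = "https" if secure else "http"
    request = urllib.request.Request(f"{scheme}://{HOST}{path}", data=body, method=method)
    request.add_header("Content-Type", "application/json")
    if secure:
        # HTTP basic auth: base64 of "user:password".
        token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request
```

If every script goes through a helper like this, switching security on later means changing one flag rather than hunting down each call.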

Docker vs non-Docker setup

If you go to the OpenSearch getting started page, one of the first decisions you will have to make is whether to use Docker or to install directly onto your machine. For larger deployments, choosing whether to use a Docker-based setup has major implications for how operations and infrastructure function. For small-scale local environment setups, it’s usually much less important. The major benefits of Docker (repeatable setup, scaling, hardware independence) don’t really apply to quick-and-dirty local work. That said, use whatever tooling is most comfortable for you. You will probably revisit the architecture and setup if the project is successful and moves to a larger scale. For a proof of concept, what you do with it will usually be more important than what platform it’s running on.

Code integration

Much of the work of setting up a proof of concept involves modifying data and moving it around. This process will almost always involve writing code. Having a coding language you’re familiar with and a good knowledge base around making API calls and interacting with files will be essential. If you’re new to programming, you’ll need to spend some extra time getting comfortable with this process. Stick with it though, as it’s a valuable skill set.

Vector Embedding

If your project involves vector search, it’s important to think about how your queries will get embedded. Unlike index-time embeddings, they need to be calculated live between the system where the user enters their query and the OpenSearch fetching process. There are several common options for handling this.

OpenSearch Pipelines

OpenSearch can set up pipelines that handle embedding for you if you use one of the models it integrates with. This eliminates the need to set up a separate service, but imposes a lot of constraints on how the model is used. For my project, I needed the ability to modify the query vectors before searching, so this option was too restrictive.

External API Call

If you’re using a commercial service, embedding will usually involve making API calls. This is the most straightforward option, but may still require an additional layer to work around the browser’s cross-origin request restrictions (discussed in more detail in the User Interfaces section).

Local Model

If you’re running a local model, you will need an easy way to call it. In general, machine learning models, including Large Language Models (LLMs) and embedding models, have the most support in Python, and I’d highly recommend using that ecosystem if possible. For my project, I needed to run OpenCLIP models locally to support image vector embedding for text-to-image search. This was accomplished using Python libraries from Hugging Face to run the model and Uvicorn to provide the server backbone.
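Whichever option you pick, the query path has the same shape: embed the text, optionally adjust the vector, then build the k-NN query. Here is a minimal sketch of that flow. `toy_embed` is a deterministic stand-in for a real model call so the example runs anywhere. The field name `embedding` and the normalization step are assumptions for illustration, not details of my project.

```python
import hashlib
import math


def toy_embed(text, dim=8):
    """Stand-in for a real model call: deterministic pseudo-vector from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def normalize(vector):
    """Scale a vector to unit length (one example of modifying a query vector)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector


def knn_query(text, field="embedding", k=10, embed=toy_embed):
    """Embed the query text, adjust the vector, and build a k-NN search body."""
    vector = normalize(embed(text))
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}
```

Passing the embedding function in as a parameter keeps the query-building code unchanged when you swap a local model for an external API.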

OpenSearch Setup and Configuration

Getting OpenSearch to run for the first time is pretty straightforward and well covered in the documentation. Getting it to do what you need for a unique project can be much more complicated. When that project is a small scale proof-of-concept, it can be difficult to parse what is best practice for general setup vs best practice for large production systems. Knowing what you need and what you don’t can let you reduce the time spent on general setup and put it towards your unique goals instead. Below are some helpful pointers for setting up a system that meets your needs without spending unnecessary time or effort.

Use the Defaults

OpenSearch uses a lot of default settings so that it can start up with little configuration. In production systems it’s common to need to change a number of these. For small local projects, it’s usually much better to leave the defaults as-is. Any changes will need to be carried through to every system they affect. For experts this is usually a minor inconvenience, but for non-experts it adds an additional burden of translation from what the documentation recommends to what works for your system.

Save your API calls

Configuring indexes and other parameters in OpenSearch involves making a lot of API calls. A lot of these calls only need to be made once, and it may be tempting to just send it off and be done (especially if you’ve been debugging unexpected issues for a while). I highly recommend resisting the temptation to do this. Having a record of the configuration API calls will be essential if the project moves beyond a proof-of-concept, as well as if you have to change anything in your setup later. You’ll have a much easier time recreating and adjusting the setup if you have a record of what you tried and what ultimately worked. One of OpenSearch's strengths is that nearly every configuration is expressed through APIs. Provided you are keeping track as you build, this makes it relatively easy to recreate environments and track changes as a project evolves.
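One lightweight way to keep that record is to write each configuration call to a small JSON file and replay the files in order. The sketch below assumes a directory of numbered files (`01_...`, `02_...`) and a `send` function you supply. Both are conventions I made up for illustration.

```python
import json
from pathlib import Path


def record_call(directory, name, method, path, body=None):
    """Write one configuration call to a JSON file so it can be replayed later."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{name}.json"
    target.write_text(json.dumps({"method": method, "path": path, "body": body}, indent=2))
    return target


def load_calls(directory):
    """Return saved calls in filename order (prefix names with 01_, 02_, ...)."""
    return [json.loads(p.read_text()) for p in sorted(Path(directory).glob("*.json"))]


def replay(directory, send):
    """Replay every saved call through send(method, path, body)."""
    return [send(call["method"], call["path"], call["body"]) for call in load_calls(directory)]
```

Because the files are plain text, they also drop straight into version control alongside the rest of the project.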

Handling data

Due to their mercurial nature, small scale experimental projects have a lot more data volatility than large scale production ones. It’s very common to change your data format several times and repopulate indexes even more. That means it’s important to balance speed with repeatability and understandability to build a successful project.

Create local copies

While it may seem easier to mirror a production pipeline and move data directly from source into your OpenSearch environment, there are a lot of benefits to creating a local copy. The first is that it reduces load on the source system. Proof-of-concept systems involve frequent changes, and if data must be fully reloaded from source each time that can create more load than those systems typically handle. Second, having a static set of data makes it easier to determine if odd results are due to misconfiguration or data anomalies. Seeing the same data consistently makes it easier to compare between different setups, especially if they’re set up at different times. Finally, operating entirely locally removes the dependency and time of network calls. This helps with speed, network load, and reliability since a dropped connection won’t cause a job to fail.
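As a rough sketch of what a local copy can look like, the code below writes records to a JSON Lines file once and then builds a bulk indexing body from that file. The `id` field and the file layout are assumptions. The body format (an action line followed by a document line, ending in a newline) is what the `_bulk` API expects.

```python
import json
from pathlib import Path


def save_local_copy(records, path):
    """Write records to a JSON Lines file, one document per line."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def bulk_payload(path, index_name, id_field="id"):
    """Build a _bulk request body (action line + document line) from the local file."""
    lines = []
    with Path(path).open(encoding="utf-8") as handle:
        for raw in handle:
            doc = json.loads(raw)
            lines.append(json.dumps({"index": {"_index": index_name, "_id": doc[id_field]}}))
            lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline
```

Reloading an index then only reads the local file, so the source system is touched once no matter how many times the index is rebuilt.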

Start small

Search data sets, in general, tend to be large. Depending on what you’re working with it might take a few minutes or many hours to push a full set of data into an OpenSearch index. That’s a lot of time to wait to find out your configuration isn’t right and you need to do it again. It’s worth taking the time to section off a smaller example data set to speed up the process of testing indexing and other integrations. Save indexing the entire data set for when you actually need to test the whole thing.

This principle also applies to using LLMs. Embedding an entire dataset is an expensive and time consuming operation. Working with small sets and/or small models will speed up the process and let you explore more of your setup before you have to commit to long batch jobs. Whether you’re using a local setup or a paid embedding API, it can be especially important to save the generated vector embeddings locally. Re-using the same vector embeddings saves cost and can speed up iteration time when testing different index setups.
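Both ideas, working from a small repeatable subset and saving embeddings locally, can be sketched in a few lines. The cache layout here (one JSON file per text, keyed by a hash of the model name and the text) is an assumption chosen for simplicity, and `embed` is whatever function calls your model or API.

```python
import hashlib
import json
import random
from pathlib import Path


def sample_subset(docs, size, seed=42):
    """Pick a repeatable random subset so test runs stay comparable."""
    docs = list(docs)
    return random.Random(seed).sample(docs, min(size, len(docs)))


def cached_embedding(text, embed, cache_dir, model_name="hypothetical-model"):
    """Return the embedding for text, computing it only if not already on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Include the model name in the key so different models never share entries.
    key = hashlib.sha256(f"{model_name}\n{text}".encode("utf-8")).hexdigest()
    target = cache_dir / f"{key}.json"
    if target.exists():
        return json.loads(target.read_text())
    vector = embed(text)
    target.write_text(json.dumps(vector))
    return vector
```

With a cache like this, rebuilding an index with a new mapping re-reads vectors from disk instead of paying for the embedding step again.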

Create New Indexes

While OpenSearch does allow some modification of index settings and field mappings, it doesn’t allow changing the type of a field once it has been mapped. In most cases it is much easier to create a new index than to attempt to modify the existing one, especially if you are using subsets of the full data set and indexing from local storage. Using OpenSearch’s composable index template system can make this process easier by separating out components, though simply saving the index creation API calls will go a long way. If tracking previous work is important, make sure to use some form of version control. That way you can see how the index was set up at each stage of the project and not just the final version. Creating a new index also has the benefit of retaining the old one in case the latest change you’re trying out doesn’t work as you’d hoped and the previous version was better.
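A simple naming convention makes new indexes cheap to create and easy to compare. The sketch below pairs a versioned name with an index creation body containing one k-NN vector field. The base name, the `description` field, and the vector dimension are placeholders to adapt to your own data.

```python
def index_name(base, version):
    """Name indexes like scenes_v003 so old versions stay around for comparison."""
    return f"{base}_v{version:03d}"


def knn_index_body(dimension, vector_field="embedding"):
    """Index creation body with one k-NN vector field and one text field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                vector_field: {"type": "knn_vector", "dimension": dimension},
                "description": {"type": "text"},
            }
        },
    }
```

The body returned by `knn_index_body(512)` would be sent with a `PUT` to `/` followed by `index_name("scenes", 3)`, and it is exactly the kind of call worth saving to a file.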

Tuning and Evaluating Search Results

For most projects, getting well tuned search results will either be the goal itself or an essential step to prove some other functionality. Since search tuning is a complex and difficult task, I’m going to focus my advice towards those who need “good enough” search relevancy. For those trying to provide great search relevancy, welcome to the club.

Lexical Tuning

Tuning Lexical search is a complicated task, but knowing how to look at results can help jump start the process. What you need to change largely comes down to one question: are documents not in the results or are they just too far from the top?

Documents not found

If documents aren’t coming back, you probably need to change how fields are analyzed. Changing stemming, searching additional fields, modifying which terms are required vs optional, and configuring synonyms can all help put the right documents into the result set. These changes tend to be the most search-centric and technical, so be prepared to spend some effort and learn some new techniques to get documents coming back as they should.

Documents not at the top

If documents are coming back but the order is poor, you’ll need to adjust how documents are scored. Changing field weights, boosting phrase matches, changing field similarity calculation, and boosting based on relevant factors (like date) can all help improve search results. Knowing which part to change and how much isn’t easy to determine. OpenSearch provides the full scoring information, but it can be difficult to understand without knowing the process. Overall, try to use reasonable values and pay attention to what helps and exactly what a change does.
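To make those knobs concrete, here is a sketch of a query body that exposes a few of them: a field weight, a phrase boost, whether all terms are required, and the `explain` flag that returns scoring details. The field names `title` and `description` and the default boost values are assumptions, not recommendations.

```python
def lexical_query(text, require_all_terms=False, title_boost=3.0, phrase_boost=2.0):
    """Build a search body with adjustable field weight and phrase boost."""
    return {
        "explain": True,  # ask OpenSearch to include the score breakdown per hit
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            # "title^3.0" weights title matches three times higher.
                            "fields": [f"title^{title_boost}", "description"],
                            "operator": "and" if require_all_terms else "or",
                        }
                    }
                ],
                # Optional clause: exact phrase matches in the title score higher.
                "should": [
                    {"match_phrase": {"title": {"query": text, "boost": phrase_boost}}}
                ],
            }
        },
    }
```

Changing one parameter at a time and comparing the explain output before and after is a manageable way to see exactly what a change does.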

If all of this sounds overwhelming, you’re not alone. Remember that Rome wasn’t built in a day and your first attempt at lexical search tuning won’t fix all of your problems. Even well tuned lexical systems have their quirks. Start small, test often, and remember that the first changes usually make the biggest difference.

kNN Vector Search

One of the major benefits of using kNN vector search is that it often gives much better out-of-the-box relevance than lexical search, which can help with creating quick prototypes. One of the downsides is that it has far fewer relevance tuning options. If your setup isn’t giving good enough results, the main lever is the model itself. There are a few ways of approaching this:

Upgrade

The easiest option is to simply switch to a more powerful model. Generally a model with more parameters or one that was trained on more similar data will give better results. One of the benefits of a proof-of-concept project with static data is that the documents only need to be embedded once to test them, reducing ongoing costs.

Re-Rank

The second option is to employ another model as a re-ranker. This setup involves rescoring the top N documents using a similarity score that a model computes directly from the document and the query together. This can give a much better ordering of documents, but there are some significant trade-offs to this approach. Since the score is calculated from both the query and the document, it can’t be pre-computed. That makes it more expensive to run queries, especially if the chosen result set size is large. Additionally, if a document isn’t in the top N results, there’s nothing the re-ranker can do to improve its score.
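The mechanics of re-ranking are simple even though the model behind it is not. In the sketch below, `score_pair` is any function that scores a query against a document's text. The `word_overlap` scorer is a toy stand-in for a real re-ranking model, and the hit format (dicts with a `text` key) is assumed for illustration.

```python
def rerank(query, hits, score_pair, top_n=10):
    """Rescore the first top_n hits with score_pair(query, text); keep the rest as-is."""
    head, tail = hits[:top_n], hits[top_n:]
    rescored = sorted(head, key=lambda hit: score_pair(query, hit["text"]), reverse=True)
    return rescored + tail


def word_overlap(query, text):
    """Toy scorer standing in for a real model: fraction of query words in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0
```

Note that `score_pair` runs once per hit in the top N on every query, which is where the extra cost comes from, and that anything past `top_n` keeps its original position.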

Model Tuning

Finally, it’s possible to tune models with example data from your specific use case. This is a highly technical operation and may require a significant amount of compute power depending on the model. While potentially the most effective option, it’s also the most difficult and expensive so pursue it with care.

Hybrid Search

When it works well, Hybrid search can significantly improve results by scoring items that match many criteria higher and by including items that would normally fall through the cracks of one particular approach. Understanding what benefit you will get for your use case requires checking a couple of important criteria: how often do documents show up in multiple search responses, and how comparable are the underlying scores?

Separate result sets and incomparable scores

If each sub-search has largely separate results and your scores aren’t easily comparable (common for Hybrid search combining Lexical and Vector search results) then the hybrid results will mostly look like a mix of the two. That typically means a higher likelihood of getting the best result, but also of getting odd results.

Separate results with comparable scores

When scores are comparable (common when combining different types of vector search like text-based and image vector search) it’s easier for poor results to filter to the bottom. Expect results to still look like a mix of the two searches, but rather than mostly alternating results you’ll see high scoring documents before low scoring documents. This probably won’t eliminate weird matches, but should help push them lower in the results.

Overlapping results

With overlapping results, Hybrid search can push results from only one underlying search down in favor of those with multiple matches. This tends to result in more consistently good results, but can decrease the importance of highly relevant results with only one source. As an example, a search combining text-based and image-based vector search will often show documents with pretty good image and text results before one that has the perfect image but a terrible description. Whether that’s good or bad depends on your use case.
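A small experiment can show which of these situations you are in. The sketch below is a toy illustration of one common combination scheme, min-max normalization followed by a weighted mean, and is not a copy of how OpenSearch implements it. It assumes a document missing from one result set contributes a score of zero for that set.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 1.0 for doc in scores}
    return {doc: (score - low) / (high - low) for doc, score in scores.items()}


def combine(result_sets, weights=None):
    """Weighted mean of normalized scores; a doc missing from a set scores 0 there."""
    weights = weights or [1.0] * len(result_sets)
    normalized = [min_max(results) for results in result_sets]
    docs = set().union(*normalized)
    total = sum(weights)
    combined = {
        doc: sum(w * scores.get(doc, 0.0) for w, scores in zip(weights, normalized)) / total
        for doc in docs
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

For example, with lexical scores `{"a": 12.0, "b": 9.0, "c": 3.0}` and vector scores `{"b": 0.9, "d": 0.8}`, document `b` ranks first because it appears in both sets, even though `a` was the top lexical hit.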

It’s important to remember that Hybrid Search is a tool and not a magic bullet. Thinking through what results you’re getting and how they’ll be combined can help you decide whether it’ll make your results better or just make your code more complicated.

User Interfaces

As a backend engine, OpenSearch provides all the APIs necessary to build a search interface, but doesn’t provide a graphical interface itself. For all but the most technical projects, having a visual interface for users to interact with search is an essential part of proving that a project functions. Whether you’re changing things up or using a pretty standard setup with a search-text-box and a result list, poor display and interactions can break a demonstration even if the underlying technology is solid. Designing and building interfaces isn’t free though, and prototypes often require speed, so identifying what you need to build and how can make sure a project stays on track. Below are a few important things to keep in mind as you design the display for your project.

Identify what’s necessary

There are many different visual interfaces for search that cover different use cases. Ecommerce often has large image displays and multiple ways of searching and navigating. Legal document search usually has a lot of tools to control exactly how searches are formulated to ensure specific result sets. Before you start to create your own interface, think through which capabilities you need. Autocomplete, filters, image displays, scoring information, and pagination are all features that may or may not be important to your project. Figuring out what you need to support before you start building will help with building the right thing for the project. Remember, just because it’s common doesn’t mean you need it in your first build.

User Interface Hazards

There are many ways to create a user interface on top of search. Most commonly the interface will be a web page for easy access. Making quick web pages to demonstrate search does have a couple of hazards that can cause a lot of headaches.

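Cross-Origin Requests

The first and most important hazard is the browser’s same-origin policy, which usually shows up as “CORS” (Cross-Origin Resource Sharing) errors. It’s sometimes mixed up with Cross-Site Scripting, but that is an attack rather than a protection. The same-origin policy is built into modern browsers and restricts a page’s scripts from reading responses from other origins, where an origin is the combination of scheme, host, and port. Practically speaking, this means your calls (say to OpenSearch APIs) have to go to the same origin as the web page unless the target server explicitly allows cross-origin requests. This breaks otherwise easy options like pointing a browser to a local html file or statically serving up a page with a basic web server while OpenSearch listens on a different port. If you want to display your search in a browser, you’ll typically need some kind of proxy server the browser can call at the same origin to get results from wherever you need them.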
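For a local demo, such a proxy can be very small. The sketch below uses only Python's standard library: it serves one page and forwards POST requests under a prefix to OpenSearch, so the browser only ever talks to one origin. The `/api` prefix, the `index.html` file name, the ports, and the assumption that OpenSearch runs locally without security are all illustrative choices.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

OPENSEARCH_URL = "http://localhost:9200"  # assumed local, security disabled
API_PREFIX = "/api"                       # hypothetical prefix for proxied calls
PAGE = Path("index.html")                 # hypothetical demo page


def upstream_url(path):
    """Map /api/<rest> to the OpenSearch URL, or None for non-API paths."""
    if not path.startswith(API_PREFIX + "/"):
        return None
    return OPENSEARCH_URL + path[len(API_PREFIX):]


class ProxyHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body, content_type):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Serve the demo page from the same origin as the API calls.
        self._reply(200, PAGE.read_bytes(), "text/html; charset=utf-8")

    def do_POST(self):
        target = upstream_url(self.path)
        if target is None:
            self._reply(404, b"{}", "application/json")
            return
        length = int(self.headers.get("Content-Length", 0))
        request = urllib.request.Request(
            target, data=self.rfile.read(length),
            headers={"Content-Type": "application/json"}, method="POST")
        try:
            with urllib.request.urlopen(request) as response:
                self._reply(response.status, response.read(), "application/json")
        except urllib.error.HTTPError as error:
            # Pass OpenSearch error responses through to the browser.
            self._reply(error.code, error.read(), "application/json")


def main(port=8000):
    """Start the proxy; call main() from a script to run it."""
    HTTPServer(("localhost", port), ProxyHandler).serve_forever()
```

With this running, a page at `http://localhost:8000/` can POST a search body to `/api/<index>/_search` and the browser sees a same-origin call. The same layer is a natural place to add query embedding later.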

Self-Signed Certificates

The second major headache is managing security certificates for https. Self-signed certificates are extremely common for development environments, but they also cause errors in API calls if not handled correctly. Consider whether you need to use https rather than http and if you do, decide on how you’ll handle certificates.
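In Python, the two usual ways of handling a self-signed certificate are to trust it explicitly or to switch verification off for throwaway work. The helper below sketches both using the standard `ssl` module. `make_ssl_context` is a hypothetical name, and the CA file path is whatever your setup generated.

```python
import ssl


def make_ssl_context(ca_file=None, verify=True):
    """Build an SSL context for calls to a development cluster."""
    if not verify:
        # Disables certificate checks entirely; only for throwaway local work.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    # With ca_file set, the self-signed (or private CA) certificate is trusted explicitly.
    return ssl.create_default_context(cafile=ca_file)
```

The resulting context can be passed to `urllib.request.urlopen(url, context=...)`. Trusting the certificate file is the safer habit if the prototype may grow into something larger.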

Balancing Aesthetics and Functionality

Depending on your preferences, you may have a strong desire to either make an interface look fully professional, or to leave it bare-bones. As with many things, either extreme has significant downsides. Proof of concept work usually involves frequent changes to the structure of what data is displayed and how. Much of the work done to make a page look aesthetically pleasing will have to be re-done each time things change. That can create a lot of churn in an area of the project that isn’t what you’re really trying to demonstrate. On the other hand, extremely limited or poorly laid out interfaces will distract from the actual functionality and may make it much harder to catch actual errors. In general, leaving things in a “functional but ugly” state while building out capability can save time, but once capabilities are mostly set, plan on improving the interface until it’s at least not distracting from what you’re trying to demo. Expect things to change a bit as you build things out as not all user-interface considerations are obvious without seeing it in action. Search is usually an interactive process and you may find you use the interface differently when it’s closer to fully formed than you did when it was first cobbled together.

Conclusion

The world of search is changing a lot with new tools and new techniques being applied in new and unique scenarios. This requires a lot of prototype work to prove that a given technique works before it’s applied at a large scale. Working in a small scale, rapidly changing environment requires a different process than building out a large stable production system. Remember, even if you don’t get the answers you’d hoped for, finding out on a small local system is a huge savings compared to finding out on a fully built environment. For successful projects, iterating quickly will give you a chance to learn what works and what doesn’t so you can build a full scale system right the first time.

OpenSearch can provide an excellent platform for projects involving kNN vector search, LLMs, and other advanced processes. Knowing which parts of the system are essential and which aren’t necessary when working locally can save a lot of time and give your projects the best chance at success. I hope the advice here helps those exploring this exciting space to save time and energy creating the prototypes they need. Good luck and happy searching!