Beyond Vectors: A Deep Dive into Modern Search in Qdrant

Author(s): Ashish Abraham

Originally published on Towards AI.

Years back, I read a book called "I Thought I Knew How to Google". It showed numerous ways to write Google queries with operators like AND, NOT, quotation marks, and other tricks, to get relevant results for our keywords every time. That approach worked well then and still works now; but the entire concept of search engines, not just Google, has changed over the years. When was the last time you actually typed something like "best smartphone AND battery life NOT iPhone" into a search box? Today, we mostly type natural-language questions, rely on autocomplete suggestions, or simply expect the search engine to understand our intent without crafting the query carefully. Search has quietly shifted from telling the machine exactly what to do to trusting the machine to figure out what we mean.

Search Has Outgrown "Just Keywords"

The queries we type now increasingly resemble questions or loosely described intent rather than precise keywords or exact terms. For example, instead of typing "Section 80C tax deduction India," users are more likely to ask, "How can I save tax in India this year?" At the same time, you can also search for something like "Something casual and green from Levi's for winter", where Levi's is a brand name and should be matched exactly as it is. This shift in user behavior, alongside advances in the AI landscape, requires developers to consider a blend of approaches. Pure keyword matching struggles with intent-heavy, conversational queries, while pure semantic search can miss critical constraints like exact product names, IDs, dates, prices, or compliance terms. Modern search systems must combine the following: semantic similarity to understand meaning, exact match to respect precision, filters to narrow down results by category or metadata, and numeric context to interpret ranges, limits, and comparisons correctly. This combination is called hybrid search.
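To make the idea of combining signals concrete, here is a minimal sketch of one common fusion technique, Reciprocal Rank Fusion (RRF), which merges a keyword ranking and a semantic ranking into a single result list. The document IDs and rankings are made up for illustration; this is not the specific fusion method any particular engine uses by default.

```python
# Illustrative sketch: fusing a keyword ranking and a semantic ranking with
# Reciprocal Rank Fusion (RRF). Doc IDs are hypothetical.

def rrf_fuse(rankings, k=60):
    """Combine several ranked lists of doc IDs into one fused ranking."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            # Each list contributes 1 / (k + rank); higher-ranked docs weigh more.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Sort doc IDs by their accumulated fusion score, best first.
    return sorted(scores, key=scores.get, reverse=True)

# A doc that ranks well in BOTH lists (the green parka) rises to the top,
# even though neither list ranked it first.
keyword_hits = ["doc_levis_501", "doc_green_parka", "doc_red_tee"]
semantic_hits = ["doc_green_parka", "doc_olive_hoodie", "doc_levis_501"]

fused = rrf_fuse([keyword_hits, semantic_hits])
```

RRF is attractive because it needs only ranks, not raw scores, so it sidesteps the problem that keyword scores (e.g., BM25) and vector similarities live on incompatible scales.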
The problem is not to build a search engine but to build one that resonates with how real people actually search. I experimented with a few options: hosted vector databases focused mainly on semantic similarity, Elasticsearch setups for hybrid retrieval, and even custom pipelines where dense retrieval, keyword filtering, and reranking were handled by separate components arranged sequentially. Each could handle only part of the problem, and none felt comfortable once I moved beyond demo queries.

Pure vector stores like ChromaDB delivered great semantic matches, but filters felt secondary. As soon as I added constraints like brand, price, or availability, I was spending more time tuning recall and latency than improving relevance. I also tried stitching multiple systems together: one for vectors, one for keywords, and custom logic to merge the results. It worked, but it added operational and conceptual complexity to what should be a simple search engine.

Out of every approach, Qdrant stood out because hybrid search wasn't something you assembled in it; it was native. Dense vectors, sparse vectors, full-text search, filters, and ranking all live in one engine. That made it easier to reason about search as a single system, not a collection of workarounds.

In this article, I'll walk you through the different concepts I explored for building search engines for modern user queries. I'll share what worked, what didn't, and show how Qdrant's features and trade-offs helped me create a search system that's both smart and precise.
Table Of Contents

· Dense Vector Search: Semantic Backbone of Modern Retrieval
∘ Similarity Measures in Vector Search
∘ Implementation
· Sparse Vector Search: Bringing Back Precision and Rare Terms
∘ Search Mechanics
∘ Implementation
· Full-Text Indexing: Lightweight but Surprisingly Powerful
∘ What Happens When You Hit Search
∘ Implementation
· ASCII-Folding: Making Multilingual Text Search Actually Work
∘ ASCII Folding in Hybrid Search
∘ Implementation
· ACORN: Filter-Aware Vector Search
∘ How Does ACORN Work?
∘ Implementation
· Leveling Up the Modern Search Stack
∘ Reranking
∘ Multilingual Tokenization
∘ Performance & Cost Improvements
· Putting It All Together: Designing a Hybrid Retrieval Pipeline
∘ Set Up Qdrant
∘ Data Ingestion
∘ Payload Indexing
∘ Hybrid Search
· Wrapping Up
· References

Dense Vector Search: Semantic Backbone of Modern Retrieval

We do not need much of an introduction to embeddings or dense vectors, though they pretty much lay the foundation for modern AI. At a high level, they are numerical representations that capture the meaning of text, images, or almost any other data modality, allowing machines to compare and reason about them meaningfully.

Text: "Running shoes for daily jogging"
        ↓
[0.12, 0.75, -0.33, …]  ← Dense vector

For example, imagine embedding a few common queries and labels into a 2D space:

Embeddings Illustration (Image By Author)

· "Running shoes for daily jogging" and "lightweight sneakers for morning runs" appear close together because both describe the same intent: comfortable footwear for regular running, even though the wording is different.
· "Wireless noise-cancelling headphones" and "Bluetooth headphones with ANC" cluster together since both refer to the same product category and feature set, expressed using different terminology.
· "iPhone 14 charging cable", "Lightning cable for Apple phone", and "Apple fast charger wire" form another cluster, grouped by accessory compatibility and charging intent.
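The 2D clustering above can be sketched numerically. The vectors below are hand-picked for illustration, not real embedding-model outputs; a real model would produce hundreds of dimensions, but the geometric intuition is the same.

```python
import numpy as np

# Toy 2D "embeddings", hand-picked to mirror the clusters described above.
# These are NOT outputs of a real embedding model.
points = {
    "running shoes for daily jogging":       np.array([0.90, 0.10]),
    "lightweight sneakers for morning runs": np.array([0.85, 0.15]),
    "wireless noise-cancelling headphones":  np.array([0.10, 0.90]),
    "bluetooth headphones with ANC":         np.array([0.15, 0.85]),
}

def dist(a: str, b: str) -> float:
    # Euclidean distance between two of the toy points.
    return float(np.linalg.norm(points[a] - points[b]))

# Same intent → small distance; different intent → large distance.
same_intent = dist("running shoes for daily jogging",
                   "lightweight sneakers for morning runs")
diff_intent = dist("running shoes for daily jogging",
                   "bluetooth headphones with ANC")
```

Here `same_intent` comes out much smaller than `diff_intent`, which is exactly the property nearest-neighbor search exploits.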
Even though the exact words differ, embeddings place related items near each other because they mean roughly the same thing. These embeddings are computed by specialized embedding models, each designed for particular tasks and available in a variety of sizes depending on performance and cost needs.

Similarity Measures in Vector Search

Before performing a vector search, the whole corpus of data you need to search in is stored as embeddings and indexed in a vector database or vector store. Once you hit search, your query is converted into a vector by the same embedding model used to prepare the database. This query embedding is then compared against each stored embedding, and the closest data points are retrieved. Let's understand how this comparison actually works in real time.

A common way to compare vectors is cosine similarity, calculated as the dot product of the vectors divided by the […]
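A minimal sketch of that comparison: cosine similarity is the dot product of two vectors divided by the product of their magnitudes, yielding 1.0 for vectors pointing the same way and lower values as directions diverge. The vectors below are illustrative.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product of the vectors divided by the product of their magnitudes.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query     = np.array([0.12, 0.75, -0.33])
doc_close = np.array([0.10, 0.80, -0.30])  # similar direction → score near 1
doc_far   = np.array([-0.70, 0.05, 0.60])  # different direction → lower score
```

Because it depends only on direction, cosine similarity ignores vector length, which is why many engines normalize embeddings and then use the plain dot product as an equivalent, cheaper measure.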
