Vector Search
LanceDB supports fast vector search at scale. The following table shows search latency on a 1M-vector dataset with a warmed-up cache.
| Percentile | Latency |
| --- | --- |
| P50 | 25ms |
| P90 | 26ms |
| P99 | 35ms |
| Max | 49ms |
Beyond latency, you can tune the following parameters to trade speed for search quality:
- `nprobes`: the number of partitions to search (probe)
- `refine_factor`: a multiplier controlling how many additional candidate rows are re-ranked with exact distances during the refine step
- `distance_range`: restrict results to vectors within a given distance range
LanceDB also delivers strong vector search performance when combined with metadata filtering: benchmarks show 65ms query latency on a 15-million-vector dataset. The combination of fast vector search and precise metadata filtering enables efficient, accurate querying of large-scale datasets.
Vector search with metadata prefiltering and postfiltering
By default, pre-filtering is performed: the filter is applied before the vector search. This can narrow the search space of a very large dataset and reduce query latency. Post-filtering is also an option: it applies the filter to the results returned by the vector search, which can return fewer matches if the filter removes most of the top results.
Batch query
LanceDB can process multiple similarity search requests simultaneously in a single operation, rather than handling each query individually.
When processing batch queries, the results include a `query_index` field that explicitly associates each result set with its corresponding query in the input batch.
Other search options
Fast search
While vector indexing occurs asynchronously, newly added vectors are immediately searchable through a fallback brute-force search. This means there is no delay between data insertion and searchability, though it may temporarily increase query response times. To optimize for speed over completeness, enable the `fast_search` flag in your query to skip searching unindexed data.
Bypass Vector Index
The bypass vector index feature prioritizes search accuracy over query speed by performing an exhaustive search across all vectors. Instead of using the approximate nearest neighbor (ANN) index, it compares the query vector against every vector in the table directly.
While this approach increases query latency, especially with large datasets, it provides exact, ground-truth results. This is particularly useful when:
- Evaluating ANN index quality
- Calculating recall metrics to tune the `nprobes` parameter
- Verifying search accuracy for critical applications
- Benchmarking approximate vs. exact search results