Reranking strategies
There are two common approaches for reranking search results from multiple sources:
- Score-based: Calculates final relevance scores as a weighted linear combination of the individual search algorithm scores. Example: a weighted linear combination of semantic search and keyword-based search results.
- Relevance-based: Discards the existing scores and calculates the relevance of each search result-query pair. Example: Cross Encoder models.
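
The two strategies above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular reranker: the function names, the min-max normalization step, and the default `weight` of 0.7 are assumptions made for the example, and `token_overlap` is only a toy stand-in for a real pairwise model such as a cross encoder. Both score sets are assumed to be "higher is better".

```python
from typing import Callable, Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so scores from different sources are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def linear_combination_rerank(
    vector_scores: Dict[str, float],
    keyword_scores: Dict[str, float],
    weight: float = 0.7,  # hypothetical default; tune per dataset
) -> List[Tuple[str, float]]:
    """Score-based: weight * vector score + (1 - weight) * keyword score."""
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    combined = {}
    for doc_id in set(vec) | set(kw):
        # A document missing from one source contributes 0 from that source.
        combined[doc_id] = weight * vec.get(doc_id, 0.0) + (1 - weight) * kw.get(doc_id, 0.0)
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


def relevance_rerank(
    query: str,
    documents: List[str],
    score_pair: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Relevance-based: ignore prior scores and score each (query, document) pair."""
    scored = [(doc, score_pair(query, doc)) for doc in documents]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def token_overlap(query: str, document: str) -> float:
    """Toy pairwise scorer: fraction of query tokens that appear in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0


# Score-based: merge semantic and keyword results with equal weights.
ranked = linear_combination_rerank(
    {"a": 0.9, "b": 0.5, "c": 0.1},
    {"b": 12.0, "c": 3.0, "d": 8.0},
    weight=0.5,
)

# Relevance-based: rescore the merged candidates against the query.
reranked = relevance_rerank(
    "hybrid search reranking",
    ["keyword search basics", "reranking hybrid search results", "unrelated text"],
    token_overlap,
)
```

In practice, `score_pair` would wrap a model call. The score-based variant is cheap because it reuses scores that already exist, while the relevance-based variant pays for one model inference per candidate.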
Example evaluation
The tables below show our evaluation results from an experiment comparing multiple rerankers on ~800 hybrid search queries. The experiment uses a modified version of an evaluation script by LlamaIndex that measures hit-rate @ top-k.

Using OpenAI text-embedding-ada-002
Vector Search baseline: 0.64
| Reranker | Top-3 | Top-5 | Top-10 |
|---|---|---|---|
| Linear Combination | 0.73 | 0.74 | 0.85 |
| Cross Encoder | 0.71 | 0.70 | 0.77 |
| Cohere | 0.81 | 0.81 | 0.85 |
| ColBERT | 0.68 | 0.68 | 0.73 |
Using OpenAI text-embedding-3-small
Vector Search baseline: 0.59
| Reranker | Top-3 | Top-5 | Top-10 |
|---|---|---|---|
| Linear Combination | 0.68 | 0.70 | 0.84 |
| Cross Encoder | 0.72 | 0.72 | 0.79 |
| Cohere | 0.79 | 0.79 | 0.84 |
| ColBERT | 0.70 | 0.70 | 0.76 |
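
For readers unfamiliar with the metric, hit-rate @ top-k can be sketched as below. This is not the evaluation script used for the tables above. It assumes each query has exactly one expected document and counts a hit when that document appears among the first k results; `hit_rate_at_k`, `fake_retrieve`, and the sample data are hypothetical names and values introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def hit_rate_at_k(
    queries: Sequence[Tuple[str, str]],
    retrieve: Callable[[str, int], List[str]],
    k: int,
) -> float:
    """Fraction of queries whose expected document id is in the top-k results."""
    if not queries:
        return 0.0
    hits = 0
    for query, expected_id in queries:
        # Slice defensively in case the retriever returns more than k ids.
        if expected_id in retrieve(query, k)[:k]:
            hits += 1
    return hits / len(queries)


# Hypothetical fixed rankings standing in for a real search + rerank pipeline.
RANKINGS = {
    "q1": ["a", "b", "c"],
    "q2": ["x", "y", "z"],
    "q3": ["m", "n", "o"],
    "q4": ["p", "q", "r"],
}


def fake_retrieve(query: str, k: int) -> List[str]:
    return RANKINGS[query][:k]


# (query, expected document id) pairs; "k" is never retrieved for q3.
labelled = [("q1", "a"), ("q2", "z"), ("q3", "k"), ("q4", "q")]

rates = {k: hit_rate_at_k(labelled, fake_retrieve, k) for k in (1, 2, 3)}
```

Under this definition, the hit rate for a fixed ranking can only stay the same or grow as k increases, since a larger cutoff never removes a result that was already counted as a hit.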