Reranker class and implementing the
rerank_hybrid() method. Optionally, you can also implement the rerank_vector() and rerank_fts()
methods if you want to support reranking for vector and FTS search separately.
Interface
The Reranker base interface comes with a merge_results() method that can be used to combine the
results of semantic and full-text search. This is a vanilla merging algorithm that simply concatenates
the results and removes the duplicates without taking the scores into consideration. It only keeps the
first copy of the row encountered. This works well in cases that don’t require the scores of semantic
and full-text search to combine the results. If you want to use the scores or want to support
return_score="all", you’ll need to implement your own merging algorithm.
Below, we show the pseudocode of a custom reranker that combines the results of semantic and full-text
search using a linear combination of the scores:
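A minimal, library-independent sketch of the linear combination is shown here. It works on plain lists of dicts rather than PyArrow tables, and the names `linear_combination`, `id`, `score`, `relevance_score`, and `weight` are hypothetical, chosen only for illustration. It assumes both score sets are already normalized to the range [0, 1] with higher meaning more relevant. If your vector results carry a distance instead of a similarity, convert it first. A row missing from one result set is treated as scoring 0.0 there, which is one possible choice among several.

```python
from typing import Dict, List


def linear_combination(
    vector_rows: List[dict],
    fts_rows: List[dict],
    weight: float = 0.7,
    key: str = "id",
) -> List[dict]:
    """Score each row as weight * vector_score + (1 - weight) * fts_score."""
    # Look up each row's score by its key in either result set.
    vector_scores: Dict[object, float] = {r[key]: r["score"] for r in vector_rows}
    fts_scores: Dict[object, float] = {r[key]: r["score"] for r in fts_rows}

    # Deduplicate, keeping the first copy of each row (vector results first).
    rows: Dict[object, dict] = {}
    for row in vector_rows + fts_rows:
        rows.setdefault(row[key], row)

    combined = []
    for row_id, row in rows.items():
        # Assumption: a row absent from one result set scores 0.0 there.
        score = (
            weight * vector_scores.get(row_id, 0.0)
            + (1 - weight) * fts_scores.get(row_id, 0.0)
        )
        merged = {k: v for k, v in row.items() if k != "score"}
        merged["relevance_score"] = score
        combined.append(merged)

    # Highest combined score first.
    combined.sort(key=lambda r: r["relevance_score"], reverse=True)
    return combined


vector_rows = [
    {"id": 1, "text": "a", "score": 0.9},
    {"id": 2, "text": "b", "score": 0.4},
]
fts_rows = [
    {"id": 2, "text": "b", "score": 1.0},
    {"id": 3, "text": "c", "score": 0.5},
]
ranked = linear_combination(vector_rows, fts_rows, weight=0.5)
```

Inside a real rerank_hybrid() implementation, the same logic would be applied to the two PyArrow tables, for example after converting them to lists or DataFrames.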
Example
As an example, let’s build a custom reranker that extends the Cohere Reranker by accepting a filter query and passing any other CohereReranker params through as kwargs.
Under the hood,
vector_results and fts_results are PyArrow tables. You can learn more about
PyArrow tables here. The advantage of PyArrow tables is their
interoperability: you can easily convert them to Pandas/Polars DataFrames, PyDict, PyList, etc.
The conversion works in both directions. Just as you can convert a PyArrow table to a Pandas
DataFrame using the to_pandas() method, you can perform DataFrame transformations
and convert the result back to a PyArrow table using the pa.Table.from_pandas() method,
as shown in the example above.