We support ColPali model embeddings for multimodal, multi-vector retrieval. ColPali is a multi-vector model: each input produces multiple embedding vectors rather than a single one, enabling more nuanced similarity matching between text queries and image documents. Using ColPali requires the colpali-engine package, which can be installed with `pip install colpali-engine`.

Because the embeddings are multi-vector, use `MultiVector(func.ndims())` instead of `Vector(func.ndims())` when defining your schema.
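A minimal schema sketch illustrating this. It assumes the embedding function is registered in LanceDB's registry under the name "colpali" and that `MultiVector` is importable from `lancedb.pydantic`; adjust the import and registry key to your installed version if they differ.

```python
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, MultiVector

# Assumed registry name: "colpali".
func = get_registry().get("colpali").create()

class ImageDoc(LanceModel):
    # Source field: an image URL (raw image bytes are also supported as a source).
    image_uri: str = func.SourceField()
    # Multi-vector field: each row stores a list of vectors,
    # so MultiVector(...) is used instead of Vector(...).
    vector: MultiVector(func.ndims()) = func.VectorField()
```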
Supported models are:
  • Metric-AI/ColQwen2.5-3b-multilingual-v1.0 (default)
  • vidore/colpali-v1.3
  • vidore/colqwen2-v1.0
  • vidore/colSmol-256M
Supported parameters (to be passed to the `create` method) are:
| Parameter | Type | Default Value | Description |
|---|---|---|---|
| `model_name` | `str` | `"Metric-AI/ColQwen2.5-3b-multilingual-v1.0"` | The name of the model to use. |
| `device` | `str` | `"auto"` | The device for inference. Can be `"auto"`, `"cpu"`, `"cuda"`, or `"mps"`. |
| `dtype` | `str` | `"bfloat16"` | Data type for model weights (`bfloat16`, `float16`, `float32`, `float64`). |
| `pooling_strategy` | `str` | `"hierarchical"` | Token pooling strategy: `"hierarchical"`, `"lambda"`, or `None`. |
| `pool_factor` | `int` | `2` | Factor to reduce sequence length when pooling is enabled. |
| `batch_size` | `int` | `2` | Batch size for processing inputs. |
| `quantization_config` | `Optional[BitsAndBytesConfig]` | `None` | Quantization configuration for the model (requires `bitsandbytes`). |
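A sketch of overriding the defaults at create time. The parameter names and values come from the table above and the supported model list; the registry name "colpali" is an assumption, as in the earlier sketch.

```python
from lancedb.embeddings import get_registry

# Pick a non-default model and tune inference settings.
func = get_registry().get("colpali").create(
    model_name="vidore/colqwen2-v1.0",   # any model from the supported list
    device="auto",                       # "auto", "cpu", "cuda", or "mps"
    dtype="bfloat16",
    pooling_strategy="hierarchical",     # "hierarchical", "lambda", or None
    pool_factor=2,
    batch_size=2,
)
```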
This embedding function supports ingesting images as both bytes and URLs, which you can then search with text queries:
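A sketch of ingesting image URLs and running a text query, continuing from the `ImageDoc` schema and `func` defined above. The table name and URLs are illustrative placeholders, not values from the original documentation.

```python
db = lancedb.connect("/tmp/lancedb")
table = db.create_table("images", schema=ImageDoc)

# Ingest image URLs; raw image bytes can be supplied for the source field as well.
table.add([
    {"image_uri": "https://example.com/cat.png"},
    {"image_uri": "https://example.com/dog.png"},
])

# The text query is embedded into multiple vectors and matched
# against the stored multi-vector image embeddings.
results = table.search("a photo of a cat").limit(5).to_pandas()
print(results)
```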