- You have a large dataset with relatively high-dimensional vectors (512, 768, 1024+)
- Index build time and query latency matter
IVF_PQ— Inverted File index with Product Quantization (default). See the vector indexing guide forIVF_PQexamples.IVF_RQ— Inverted File index with RaBitQ quantization (binary, 1 bit per dimension). See below for details.
IVF_PQ is the default indexing option in LanceDB and works well in many cases. However, in cases where more drastic compression is needed, RaBitQ is also a reasonable option.
RaBitQ quantization
RaBitQ is a binary quantization method that represents each normalized embedding using 1 bit per dimension, plus a couple of small corrective scalars. In practice, a 1,024-dimensionalfloat32 vector that would normally take 4 KB can be compressed to roughly a few hundred bytes with RaBitQ, while still maintaining reasonable recall.
How RaBitQ works
- Embeddings are grouped around centroids (as in other IVF indexes).
- Each residual vector is normalized and mapped to the nearest vertex of a randomly rotated hypercube on the unit sphere.
- The sign pattern of that vector is stored as bits (1 bit per dimension).
- Two small corrective factors are stored:
- The distance from the original vector to its centroid
- The dot product between the normalized vector and its quantized version
IVF_PQ, RaBitQ:
- Avoids training expensive PQ codebooks
- Builds indexes faster and handles updates more easily
- Maintains or improves recall at high dimensionality under the same storage budget
Using RaBitQ
You can create an RaBitQ-backed vector index by settingindex_type="IVF_RQ" when calling create_index.
num_bits controls how many bits per dimension are used:
API Reference
1 bit is the classic RaBitQ setting, but you could (at higher computational cost) set it to 2, 4 or 8 bits if you want to improve the fidelity for better precision or recall. It’s also possible to tune the number of IVF partitions inIVF_RQ, similar to how you would do in IVF_PQ.
The full list of parameters to the algorithm are listed below.
distance_type: Literal[“l2”, “cosine”, “dot”], defaults to “l2”
The distance metric to use for similarity comparison. Choose “l2” for Euclidean, “cosine” for cosine similarity, or “dot” for dot product.num_partitions: Optional[int], defaults to None
Number of IVF partitions (affects index build time and query accuracy). More partitions can improve recall but may increase build time.num_bits: int, defaults to 1
Bits per dimension for quantization (1 is standard RaBitQ). Higher values improve fidelity at the cost of more storage and computation.max_iterations: int, defaults to 50
Maximum number of iterations for training the quantizer. Increase for larger datasets or to improve quantization quality.sample_rate: int, defaults to 256
Number of samples per partition during training. Higher values may improve accuracy but increase training time.target_partition_size: Optional[int], defaults to None
Target number of vectors per partition. Adjust to control partition granularity and memory usage.