High-performance random access and data management for model training
Use LanceDB to curate, explore and distribute very large multimodal datasets for training and fine-tuning models.
LanceDB comes with built-in table versioning, schema evolution, and fast random access, making it far more efficient to do
dataset slicing, sampling, filtering and shuffles on large, rapidly evolving datasets.
Massively scalable, fast and high-quality retrieval − without breaking the bank
Use LanceDB as the data + retrieval layer for production AI workloads: RAG, agents, semantic search,
recommendation systems, and more.
Keep multimodal data, metadata, and embeddings in the same table and query them via vector search,
full-text search or SQL. Easily add new features (columns in your tables) as your
application evolves, without copying existing data.

Use cases
- Search: Build high-performance search and retrieval applications using LanceDB’s optimized storage, including vector search, full-text search, and hybrid search with secondary indexes.
- Data Curation: Manage and filter on petabyte-scale multimodal datasets, including video and point cloud data, to gain insights, explore data and inform model development.
- Feature engineering: Add new columns (features), create embeddings, and transform your data at scale. LanceDB lets you extend tables both vertically and horizontally with minimal I/O overhead.
- Training: Efficiently access and manage large-scale multimodal datasets for training and fine-tuning AI models.
Choose how you run LanceDB
Depending on your needs, you can choose one of the following ways to run LanceDB.1. LanceDB OSS
The fastest way to get started is the open-source embedded library, with client SDKs in Python, TypeScript and Rust. Run it locally during development, then use the same data model and APIs as you scale up and need a managed solution. Start here:Quickstart
Get started with LanceDB in minutes.
Basic Table Operations
Create tables, search vectors, and modify data in LanceDB.
2. LanceDB Enterprise
LanceDB Enterprise is a distributed and managed multimodal lakehouse built for search, curation, feature engineering, and training-oriented data access workflows on top of the same core table abstraction. This eliminates the need for teams to build bespoke infrastructure to manage petabyte-scale multimodal datasets. To set up LanceDB Enterprise in your organization, reach out to us at contact@lancedb.com.Built with scale, performance, and security in mind.LanceDB Enterprise is designed for very large-scale, high-performance, distributed workloads in
private deployments, and can operate under strict security requirements.
Quickstart
Get started with LanceDB Enterprise in minutes.