LeRobot is Hugging Face’s open-source robotics stack for collecting data, training policies, running simulations, and sharing robotics datasets and models on the Hub. LeRobotDataset v3.0 standardizes robot learning data across sensorimotor time series, actions, multi-camera video, and task metadata. Its v3 layout stores high-frequency tabular signals in Parquet, visual streams as MP4 shards, and metadata that reconstructs episode-level views from larger files. Lance is useful next to LeRobot when you want high-performance random access, lazy multimodal blob reads, and a single table interface for curation, search, and training data preparation. The lerobot-lancedb package provides Lance-backed LeRobotDataset subclasses, and LanceDB can also open Lance-formatted LeRobot datasets on the Hub directly through hf:// URIs.
Install
Use Lance-backed LeRobotDataset loaders
Use LeRobotLanceDataset when your Lance-backed dataset stores decoded image observations. It is intended as a drop-in replacement for LeRobotDataset, so existing policy training code can keep using standard PyTorch dataset and dataloader patterns.
For datasets that store camera observations as MP4 video segments, use LeRobotLanceVideoDataset instead.
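The choice between the two loaders can be sketched as a small helper. This is an illustration only: the class names come from the text above, but the module name `lerobot_lancedb` and the single-argument constructor (mirroring `LeRobotDataset(repo_id)`) are assumptions, and `make_dataset` is a hypothetical helper rather than part of the package.

```python
import importlib

# Assumption: the lerobot-lancedb package is importable under this module name.
ASSUMED_MODULE = "lerobot_lancedb"


def loader_class_name(stores_video: bool) -> str:
    """Return the loader class name suited to how camera data is stored."""
    return "LeRobotLanceVideoDataset" if stores_video else "LeRobotLanceDataset"


def make_dataset(repo_id: str, stores_video: bool):
    """Instantiate the matching loader.

    Assumes a LeRobotDataset-style constructor that accepts a repo id.
    """
    module = importlib.import_module(ASSUMED_MODULE)  # imported lazily
    loader_cls = getattr(module, loader_class_name(stores_video))
    return loader_cls(repo_id)
```

If the assumptions hold, the returned object could be passed to a standard PyTorch `DataLoader` in the same way as a regular LeRobotDataset.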
Use the image loader for Lance-backed repos that store image frames. Use the video loader for MP4-backed LeRobot datasets such as lance-format/lerobot-pusht-lance.
Open LeRobot Lance tables with LanceDB
Lance-formatted LeRobot datasets published by lance-format expose each .lance file under data/ as a LanceDB table. The PushT dataset, for example, has frames, episodes, and videos tables.
This is useful when you want to inspect schemas, count rows, sample metadata, or build curation workflows before handing the selected data to a training loop.
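A minimal sketch of that inspection step might look like the following. The `hf://datasets/<repo>/data` URI shape is an assumption based on the layout described above, and `hub_data_uri` and `summarize_tables` are hypothetical helper names. The LanceDB calls used (`connect`, `open_table`, `count_rows`, `schema`) are part of its Python API.

```python
def hub_data_uri(repo_id: str) -> str:
    """Build an hf:// URI for the data/ directory of a Hub dataset repo.

    The URI shape is an assumption, not confirmed by the source text.
    """
    return f"hf://datasets/{repo_id}/data"


def summarize_tables(uri: str, names=("frames", "episodes", "videos")) -> dict:
    """Return {table_name: (row_count, column_names)} for each named table."""
    import lancedb  # imported lazily so hub_data_uri works without it

    db = lancedb.connect(uri)
    summary = {}
    for name in names:
        table = db.open_table(name)
        summary[name] = (table.count_rows(), table.schema.names)
    return summary
```

Calling `summarize_tables(hub_data_uri("lance-format/lerobot-pusht-lance"))` would need network access to the Hub, so it is left out of the block itself.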
Filter a frame window
Robotics workflows often need deterministic slices by episode_index, frame_index, or task metadata before any model training starts. LanceDB can filter those rows without reading video blobs.
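As a rough sketch, a frame window can be expressed as a SQL-style predicate and passed to a LanceDB query. The helper names here are hypothetical, and the column names `episode_index` and `frame_index` are taken from the text above. Selecting only scalar columns is what keeps the blob columns from being read.

```python
def frame_window_filter(episode_index: int, start: int, stop: int) -> str:
    """Build a predicate for frames [start, stop) of a single episode."""
    if start < 0 or stop <= start:
        raise ValueError("expected 0 <= start < stop")
    return (
        f"episode_index = {int(episode_index)} "
        f"AND frame_index >= {int(start)} AND frame_index < {int(stop)}"
    )


def fetch_frame_window(table, episode_index, start, stop,
                       columns=("episode_index", "frame_index")):
    """Run the filter on an already-open LanceDB table.

    Only the listed scalar columns are selected.
    """
    return (
        table.search()
        .where(frame_window_filter(episode_index, start, stop))
        .select(list(columns))
        .limit(stop - start)
        .to_list()
    )
```

The half-open `[start, stop)` convention matches Python slicing, so the window length is simply `stop - start`.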
From there, you can materialize a smaller local LanceDB database, add derived columns, attach embeddings, or build vector and scalar indexes for faster repeated access.
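One possible shape for that step is sketched below, assuming the selected rows are already available as a list of dictionaries. The derived column `is_episode_start`, the local path, and the table name are all illustrative choices, not names from the datasets themselves.

```python
def add_episode_start_flag(rows):
    """Return copies of rows with a derived boolean column, is_episode_start."""
    return [{**row, "is_episode_start": row["frame_index"] == 0} for row in rows]


def materialize_subset(rows, db_path="./lerobot_subset", table_name="frames_subset"):
    """Write rows to a local LanceDB table and index episode_index."""
    import lancedb  # imported lazily; only needed when writing

    db = lancedb.connect(db_path)
    table = db.create_table(
        table_name,
        data=add_episode_start_flag(rows),
        mode="overwrite",  # replace any earlier copy of the subset
    )
    # A scalar index speeds up repeated filters on this column.
    table.create_scalar_index("episode_index")
    return table
```

Keeping the derived-column step as a pure function makes it easy to check on a handful of rows before anything is written to disk.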
Example Lance-formatted LeRobot datasets
LeRobot PushT
A Lance-formatted version of lerobot/pusht with frame, episode, and video tables.
LeRobot X-VLA Soft-Fold
A multi-camera robotics dataset packaged as Lance tables for frame-level and episode-level access.
More resources
LeRobotDataset v3.0
Hugging Face’s guide to the v3 dataset layout, streaming, transforms, and migration.
lerobot-lancedb
API documentation for the Lance-backed LeRobotDataset implementations.
When to use each interface
| Interface | Best for |
|---|---|
| LeRobotDataset | Standard LeRobot training loops and policy code |
| LeRobotLanceDataset | Drop-in training on Lance-backed image datasets |
| LeRobotLanceVideoDataset | Drop-in training on Lance-backed video datasets |
| LanceDB | Interactive inspection, filtering, curation, search, indexing, and materializing subsets |
| lance.dataset(...) | Lower-level schema, fragment, index, and blob access |