lance-format/kitti-2d-detection-lance
Lance-formatted version of the KITTI 2D Object Detection benchmark: 7,481 training images from the KITTI Vision Benchmark Suite with 2D bounding boxes plus the full 3D-box and observation-angle metadata. Sourced from nateraw/kitti, so no manual signup or download from cvlibs.net is required. KITTI is the canonical autonomous-driving 2D/3D detection benchmark, useful for AV perception research, robust real-world benchmarking, and as a small-scale companion to nuScenes/Waymo.

Splits

| Split | Rows |
|---|---|
| train.lance | 7,481 |
(The test split has no labels published, so we omit it. Add it back via --splits train test if you want the unlabeled images as well.)

Schema

| Column | Type | Notes |
|---|---|---|
| `id` | `int64` | Row index within split |
| `image` | `large_binary` | Inline JPEG bytes (re-encoded from the source PNG) |
| `bboxes` | `list<list<float32, 4>>` | 2D box per object, `[left, top, right, bottom]` in pixel coords |
| `alphas` | `list<float32>` | Observation angle (radians, KITTI convention) |
| `dimensions` | `list<list<float32, 3>>` | 3D box size (h, w, l) in metres |
| `locations` | `list<list<float32, 3>>` | 3D centre (x, y, z) in camera coords (metres) |
| `rotation_y` | `list<float32>` | Yaw angle in camera coords (radians) |
| `occluded` | `list<int8>` | KITTI occlusion flag (0=visible, 1=partly, 2=largely, 3=unknown) |
| `truncated` | `list<float32>` | Truncation fraction (0.0-1.0) |
| `types` | `list<string>` | Class name per object (e.g. Car, Pedestrian, Cyclist, DontCare) |
| `num_objects` | `int32` | Number of annotated objects |
| `types_present` | `list<string>` | Deduped class names; feeds the LABEL_LIST index |
| `image_emb` | `fixed_size_list<float32, 512>` | OpenCLIP ViT-B-32 image embedding (cosine-normalized) |
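
The 3D fields follow the usual KITTI camera-coordinate convention. As a hedged sketch (assuming the first frame has at least one annotated object), the 8 corners of an object's 3D box can be recovered from dimensions, locations, and rotation_y; note the table carries no camera calibration, so projecting these corners into the image would still require the original KITTI calib files:

import lance
import numpy as np

ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")
row = ds.take([0], columns=["dimensions", "locations", "rotation_y"]).to_pylist()[0]

def box_corners(dim, loc, ry):
    # KITTI convention: dim = (h, w, l); loc is the bottom-centre of the box
    # in camera coordinates; ry is the yaw around the camera's y (down) axis.
    h, w, l = dim
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([0.0, 0, 0, 0, -h, -h, -h, -h])
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    R = np.array([[np.cos(ry), 0, np.sin(ry)],
                  [0, 1, 0],
                  [-np.sin(ry), 0, np.cos(ry)]])
    return (R @ np.stack([x, y, z])).T + np.asarray(loc)  # (8, 3) corner array

corners = box_corners(row["dimensions"][0], row["locations"][0], row["rotation_y"][0])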

Pre-built indices

  • IVF_PQ on image_emb (metric=cosine)
  • BTREE on num_objects
  • LABEL_LIST on types_present

Quick start

import lance

ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")
print(ds.count_rows(), ds.schema.names, ds.list_indices())

Load with LanceDB

These tables can also be consumed by LanceDB, the multimodal lakehouse and embedded search library built on top of Lance, for simplified vector search and other queries.

import lancedb

db = lancedb.connect("hf://datasets/lance-format/kitti-2d-detection-lance/data")
tbl = db.open_table("train")
print(f"LanceDB table opened with {len(tbl)} frames")

Read a frame with annotations

import io
import lance
from PIL import Image, ImageDraw

ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")
row = ds.take([0], columns=["image", "bboxes", "types"]).to_pylist()[0]

img = Image.open(io.BytesIO(row["image"])).convert("RGB")
draw = ImageDraw.Draw(img)
for (l, t, r, b), cls in zip(row["bboxes"], row["types"]):
    if cls == "DontCare":
        continue
    draw.rectangle([l, t, r, b], outline="lime", width=2)
    draw.text((l + 4, t + 2), cls, fill="lime")
img.save("kitti.jpg")
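
For training or bulk processing, it is usually cheaper to stream Arrow batches than to take rows one at a time. A minimal sketch using LanceDataset.to_batches (the batch size is an arbitrary choice):

import io
import lance
from PIL import Image

ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")

# Stream only the columns we need; each batch is a pyarrow RecordBatch.
for batch in ds.to_batches(columns=["image", "bboxes", "types"], batch_size=32):
    for row in batch.to_pylist():
        img = Image.open(io.BytesIO(row["image"])).convert("RGB")
        # ... feed img, row["bboxes"], row["types"] into your pipeline
    break  # drop this to iterate over the full split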

Filter by classes

import lance
ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")

# Frames containing both a Car and a Cyclist (LABEL_LIST index makes this fast).
both = ds.scanner(
    filter="array_has_all(types_present, ['Car', 'Cyclist'])",
    columns=["id", "types_present"],
    limit=10,
).to_table()

# Frames with at least 10 objects (for crowded-scene experiments).
crowded = ds.scanner(filter="num_objects >= 10", columns=["id"], limit=10).to_table()
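
Predicates compose with ordinary SQL, so both conditions can go into one pushed-down filter; a small sketch combining the two examples above:

# Crowded frames that also contain a Pedestrian.
crowded_peds = ds.scanner(
    filter="array_has_all(types_present, ['Pedestrian']) AND num_objects >= 10",
    columns=["id", "num_objects", "types_present"],
    limit=10,
).to_table()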

Filter by classes with LanceDB

import lancedb

db = lancedb.connect("hf://datasets/lance-format/kitti-2d-detection-lance/data")
tbl = db.open_table("train")

both = (
    tbl.search()
    .where("array_has_all(types_present, ['Car', 'Cyclist'])")
    .select(["id", "types_present"])
    .limit(10)
    .to_list()
)

crowded = (
    tbl.search()
    .where("num_objects >= 10")
    .select(["id"])
    .limit(10)
    .to_list()
)

Vector search

import lance

ds = lance.dataset("hf://datasets/lance-format/kitti-2d-detection-lance/data/train.lance")

# Use the first frame's own embedding as the query vector.
ref = ds.take([0], columns=["image_emb"]).to_pylist()[0]["image_emb"]

neighbors = ds.scanner(
    nearest={"column": "image_emb", "q": ref, "k": 5, "nprobes": 16, "refine_factor": 30},
    columns=["id", "types_present"],
).to_table().to_pylist()
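
Vector queries can also be combined with the SQL filters shown earlier; a hedged sketch using lance's prefilter option (reusing ds and ref from the block above):

# Nearest neighbours restricted to frames that contain a Cyclist.
filtered = ds.scanner(
    nearest={"column": "image_emb", "q": ref, "k": 5},
    filter="array_has_all(types_present, ['Cyclist'])",
    prefilter=True,  # apply the filter before the ANN search rather than after
    columns=["id", "types_present"],
).to_table().to_pylist()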

Vector search with LanceDB

import lancedb

db = lancedb.connect("hf://datasets/lance-format/kitti-2d-detection-lance/data")
tbl = db.open_table("train")

ref = tbl.search().limit(1).select(["image_emb"]).to_list()[0]
query_embedding = ref["image_emb"]

results = (
    tbl.search(query_embedding)
    .metric("cosine")
    .select(["id", "types_present"])
    .limit(5)
    .to_list()
)
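
Because image_emb comes from OpenCLIP ViT-B-32, text-to-image search works if the query text is embedded with the same model. A hedged sketch (the card does not state which pretrained weights were used, so laion2b_s34b_b79k is an assumption, and open_clip_torch plus torch must be installed):

import lancedb
import open_clip
import torch

db = lancedb.connect("hf://datasets/lance-format/kitti-2d-detection-lance/data")
tbl = db.open_table("train")

# ASSUMPTION: this pretrained tag may not match the weights used to build image_emb.
model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

with torch.no_grad():
    tokens = tokenizer(["a cyclist riding on a city street"])
    q = model.encode_text(tokens)
    q = (q / q.norm(dim=-1, keepdim=True)).squeeze(0).tolist()  # match the cosine-normalized column

hits = tbl.search(q).metric("cosine").select(["id", "types_present"]).limit(5).to_list()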

Why Lance?

  • One dataset for images + 2D + 3D annotations + embeddings + indices — no parallel image_2/ and label_2/ folders.
  • On-disk vector and label-list indices live next to the data, so search and class-based filtering work on local copies and on the Hub.
  • Schema evolution: add columns (LIDAR features, alternative embeddings, model predictions) without rewriting the data; see the sketch below.
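
As an illustration of that last point, a hedged sketch of attaching per-frame model predictions as a new column via LanceDataset.merge. It assumes a writable local copy of the dataset (the Hub copy is read-only), and the predictions table here is entirely hypothetical:

import lance
import pyarrow as pa

# Work on a local copy; the path is a placeholder.
ds = lance.dataset("/path/to/kitti-2d-detection.lance")

# Hypothetical per-frame predictions, keyed on the existing id column.
preds = pa.table({
    "id": pa.array(range(ds.count_rows()), type=pa.int64()),
    "pred_num_objects": pa.array([0] * ds.count_rows(), type=pa.int32()),
})

# merge joins on the key and appends the new column without rewriting image bytes.
ds.merge(preds, left_on="id", right_on="id")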

Source & license

Converted from nateraw/kitti. KITTI is released under the CC BY-NC-SA 3.0 license by Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago — non-commercial research use only. See the KITTI license page for details.

Citation

@inproceedings{geiger2012are,
  title={Are we ready for autonomous driving? The KITTI vision benchmark suite},
  author={Geiger, Andreas and Lenz, Philip and Urtasun, Raquel},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2012}
}