Skip to main content
LanceDB integrates with DuckDB through the Lance extension for DuckDB. In this page, we’ll show how LanceDB manages table lifecycle, and DuckDB provides SQL analytics (including joins) and search over those tables. Note that earlier versions of LanceDB used to recommend converting Lance tables to Arrow tables via table.to_arrow(). Although this method is still available (because DuckDB natively scans Arrow tables), it is no longer the recommended workflow for working with Lance tables in DuckDB. This page shows how to use the Lance extension with namespace-attached LanceDB tables, allowing you to pushdown SQL queries directly to the Lance layer.

Install

Install the DuckDB CLI as per their docs and alternatively, their Python package with pip install duckdb. Then, open the DuckDB CLI and install and load the Lance extension as follows:
SQL
INSTALL lance;
LOAD lance;

Attach the directory namespace in DuckDB

Attach the LanceDB root directory as a Lance namespace:
SQL
ATTACH './local_lancedb' AS lance_ns (TYPE LANCE);
In this page, tables are referenced using lance_ns.main.<table_name>, so the table path is lance_ns.main.lance_duck.

Write Lance table

Create the lance_duck table using SQL and populate it with sample data:
SQL
CREATE OR REPLACE TABLE lance_ns.main.lance_duck AS
SELECT *
FROM (
  VALUES
    ('duck', 'quack', [0.9, 0.7, 0.1]::FLOAT[]),
    ('horse', 'neigh', [0.3, 0.1, 0.5]::FLOAT[]),
    ('dragon', 'roar', [0.5, 0.2, 0.7]::FLOAT[])
) AS t(animal, noise, vector);
This table is the source of truth for all DuckDB queries below.
The examples below show SQL entered in the DuckDB CLI. You can run the same SQL from Python as well, using LanceDB and DuckDB’s Python clients in your application code.

Query the table with SQL

SQL
SELECT *
  FROM lance_ns.main.lance_duck
  LIMIT 5;
SQL
SELECT animal, noise, vector, _distance
  FROM lance_vector_search(
    'lance_ns.main.lance_duck',
    'vector',
    [0.8, 0.7, 0.2]::FLOAT[],
    k = 1,
    prefilter = true
  )
  ORDER BY _distance ASC;
SQL
SELECT animal, noise, vector, _score
  FROM lance_fts(
    'lance_ns.main.lance_duck',
    'animal',
    'the brave knight faced the dragon',
    k = 1,
    prefilter = true
  )
  ORDER BY _score DESC;
SQL
SELECT animal, noise, vector, _hybrid_score, _distance, _score
  FROM lance_hybrid_search(
    'lance_ns.main.lance_duck',
    'vector',
    [0.8, 0.7, 0.2]::FLOAT[],
    'animal',
    'the duck surprised the dragon',
    k = 2,
    prefilter = false,
    alpha = 0.5,
    oversample_factor = 4
  )
  ORDER BY _hybrid_score DESC;

Directory namespace model

A directory namespace maps a LanceDB catalog root to namespace-qualified table identifiers in DuckDB. This keeps table discovery and table naming stable as your project grows. To learn more about the catalog and namespace model, see Namespaces and the Catalog Model.

Advanced usage

See the docs directory in the Lance-DuckDB extension repo for more advanced usage on SQL, Cloud and REST API clients.