Pydantic is a data validation library in Python. LanceDB integrates with Pydantic for schema inference, data ingestion, and query result casting. Using lancedb.pydantic.LanceModel, users can seamlessly integrate Pydantic with the rest of the LanceDB APIs.
First, import the necessary LanceDB and Pydantic modules. Next, define your Pydantic model by inheriting from LanceModel and specifying your fields, including a vector field. Then set the database connection URL. Finally, create a table, add data, and perform vector search operations.
Vector Field
LanceDB provides a lancedb.pydantic.Vector method to define a vector field in a Pydantic model.
This example demonstrates how LanceDB automatically converts Pydantic field types to their corresponding Apache Arrow data types. The pydantic_to_schema() function takes a Pydantic model and generates an Arrow schema where:
- int fields become pa.int64() (64-bit integers)
- str fields become pa.utf8() (UTF-8 encoded strings)
- Vector(768) becomes pa.list_(pa.float32(), 768) (fixed-size list of 768 float32 values)
- The False parameter indicates that the fields are not nullable
Type Conversion
LanceDB automatically converts Pydantic fields to Apache Arrow data types. Currently supported type conversions:

| Pydantic Field Type | PyArrow Data Type |
|---|---|
| int | pyarrow.int64 |
| float | pyarrow.float64 |
| bool | pyarrow.bool_ |
| str | pyarrow.utf8() |
| list | pyarrow.List |
| BaseModel | pyarrow.Struct |
| Vector(n) | pyarrow.FixedSizeList(float32, n) |
LanceDB can create an Apache Arrow schema from a pydantic.BaseModel via the lancedb.pydantic.pydantic_to_schema method.
This example shows a more complex Pydantic model with various field types and demonstrates how LanceDB handles:
- Basic types: int and str fields
- Vector fields: Vector(1536) creates a fixed-size list of 1536 float32 values
- List fields: List[int] becomes a variable-length list of int64 values
- Schema generation: The pydantic_to_schema() function automatically converts all these types to their Arrow equivalents