Pydantic is a data validation library for Python. LanceDB integrates with Pydantic for schema inference, data ingestion, and query result casting. By subclassing lancedb.pydantic.LanceModel, users can plug Pydantic models directly into the rest of the LanceDB APIs.

A typical workflow has four steps: import the necessary LanceDB and Pydantic modules; define a Pydantic model by inheriting from LanceModel and specifying its fields, including a vector field; set the database connection URL; then create a table, add data, and perform vector search operations.

Vector Field

LanceDB provides the lancedb.pydantic.Vector function to define a vector field in a Pydantic model. LanceDB automatically converts Pydantic field types to their corresponding Apache Arrow data types. For a model with int, str, and Vector(768) fields, the pydantic_to_schema() function takes the Pydantic model and generates an Arrow schema where:
  • int fields become pa.int64() (64-bit integers)
  • str fields become pa.utf8() (UTF-8 encoded strings)
  • Vector(768) becomes pa.list_(pa.float32(), 768) (fixed-size list of 768 float32 values)
  • The False argument passed to pa.field() as its nullable parameter indicates that a field is not nullable

Type Conversion

LanceDB automatically converts Pydantic fields to Apache Arrow data types. Currently supported type conversions:

| Pydantic Field Type | PyArrow Data Type |
| --- | --- |
| int | pyarrow.int64() |
| float | pyarrow.float64() |
| bool | pyarrow.bool_() |
| str | pyarrow.utf8() |
| list | pyarrow.list_() (variable-length list) |
| BaseModel | pyarrow.struct() |
| Vector(n) | pyarrow.list_(pyarrow.float32(), n) (fixed-size list) |
LanceDB can create an Apache Arrow schema from a pydantic.BaseModel via the lancedb.pydantic.pydantic_to_schema function. A more complex Pydantic model with several field types illustrates how LanceDB handles:
  • Basic types: int and str fields
  • Vector fields: Vector(1536) creates a fixed-size list of 1536 float32 values
  • List fields: List[int] becomes a variable-length list of int64 values
  • Schema generation: The pydantic_to_schema() function automatically converts all these types to their Arrow equivalents