Schema evolution enables non-breaking modifications to a database table's structure, such as adding columns, altering data types, or dropping fields, so that a table can adapt to evolving data requirements without service interruptions. LanceDB supports ACID-compliant schema evolution through granular operations (add/alter/drop columns).
LanceDB supports three primary schema evolution operations: adding columns, altering columns, and dropping columns.
Schema evolution operations are applied immediately but do not typically require rewriting all data. However, data type changes may involve more substantial operations.
You can add new columns to a table with the `add_columns` method in Python or `addColumns` in TypeScript/JavaScript.
New columns are populated based on SQL expressions you provide.
When adding columns that should contain NULL values, be sure to cast the NULL to the appropriate type, e.g., `cast(NULL as timestamp)`.
You can alter columns using the `alter_columns` method in Python or `alterColumns` in TypeScript/JavaScript. This allows you to rename a column, change its data type, or modify its nullability.
Changing data types requires rewriting the column data and may be resource-intensive for large tables. Renaming columns or changing nullability is more efficient as it only updates metadata.
You can remove columns using the `drop_columns` method in Python or [`dropColumns`](https://lancedb.github.io/lancedb/js/classes/Table/#altercolumns) in TypeScript/JavaScript.
Dropping columns cannot be undone. Make sure you have backups or are certain before removing columns.
Vector columns (used for embeddings) have special considerations. When altering vector columns, you should ensure consistent dimensionality.
A common schema evolution task is converting a generic list column to a fixed-size list for performance.