Enterprise-only

When working with multimodal data at scale, LanceDB Enterprise makes it easy to define, extract, and transform raw data into useful information and features for your AI applications. LanceDB Enterprise’s Multimodal Feature Engineering package is designed to improve the productivity of AI engineers operating at immense scale. With an API designed to leverage LanceDB’s optimized data storage and retrieval, it streamlines prototyping extraction and transformation tasks, performing experiments, exploring your data, scaling up execution, and moving to production.

Feature Engineering and the geneva Python package are currently only available as part of LanceDB Enterprise. Please contact us if you’re interested in scaling up your feature engineering workloads for your AI and multimodal use cases.
The geneva package uses Python User Defined Functions (UDFs) to define features as columns in a Lance dataset. Adding a feature is straightforward:
1. Prototype your Python function in your favorite environment.
2. Wrap the function with a small UDF decorator (see UDFs).
3. Register the UDF as a virtual column using Table.add_columns().
4. Trigger a backfill operation (see Backfilling).
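The four steps can be sketched in plain Python to show how they fit together. This is a toy illustration only and does not use the geneva package: `toy_udf`, `ToyTable`, and its `backfill` method are hypothetical stand-ins written for this example, and the real decorator and method signatures may differ (see UDFs and Backfilling for the actual API).

```python
import inspect
from typing import Any, Callable, Dict, List


# Step 1: prototype the feature as an ordinary Python function.
def area(width: int, height: int) -> int:
    return width * height


# Step 2: wrap it with a decorator. This stand-in (NOT geneva's decorator)
# only records which input columns the function reads, taken from its
# parameter names.
def toy_udf(fn: Callable[..., Any]) -> Callable[..., Any]:
    fn.input_columns = list(inspect.signature(fn).parameters)
    return fn


area_udf = toy_udf(area)


class ToyTable:
    """Hypothetical in-memory table that mimics the register/backfill flow."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self.rows = [dict(r) for r in rows]
        self.virtual: Dict[str, Callable[..., Any]] = {}

    # Step 3: register the UDF as a virtual column. No values are computed
    # yet; every row just gets an empty slot for the new column.
    def add_columns(self, columns: Dict[str, Callable[..., Any]]) -> None:
        for name, fn in columns.items():
            self.virtual[name] = fn
            for row in self.rows:
                row.setdefault(name, None)

    # Step 4: backfill computes the column for rows that do not have it yet.
    def backfill(self, name: str) -> None:
        fn = self.virtual[name]
        for row in self.rows:
            if row[name] is None:
                row[name] = fn(*(row[c] for c in fn.input_columns))


table = ToyTable([{"width": 2, "height": 3}, {"width": 4, "height": 5}])
table.add_columns({"area": area_udf})
table.backfill("area")
print([row["area"] for row in table.rows])
```

The point of the separation is that registering the column is cheap and declarative, while the backfill is the step that does the actual computation and is the part that would be scaled out.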
You can build your Python feature generator function in an IDE or a notebook using your project’s Python version and dependencies. geneva automates much of the dependency and version management needed to move from prototype to scale and production.

Continue learning

Visit the following pages to learn more about feature engineering in LanceDB Enterprise: