- the default cluster — the compute pool jobs run on, and
- the default manifest — the Python dependency environment (image and packages) the distributed workers run with.
To override the cluster a job runs on — for example to route an embedding backfill to a
GPU pool — see Advanced Execution Contexts.
Pinning a dependency manifest
A manifest pins the Python image and packages the distributed workers run with. Build one with the manifest builders, then attach it to your transform with the manifest= argument on
@udf, @chunker, or @udtf. The manifest is snapshotted into the column (or view) metadata
when the transform is registered, so every backfill or refresh of that transform uses it
automatically — there is no per-call manifest argument to remember.
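The snapshot behavior can be pictured with a small, self-contained model. This is not the Geneva API: `Manifest`, `Column`, and `register_column` are hypothetical stand-ins, used only to illustrate that registration stores a copy of the manifest, so later changes to the original object do not reach the column.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical stand-in for a dependency manifest (not the real GenevaManifest)."""
    image: str
    pip: list[str] = field(default_factory=list)


@dataclass
class Column:
    """Hypothetical column record that stores its own manifest snapshot."""
    name: str
    manifest: Manifest


def register_column(name: str, manifest: Manifest) -> Column:
    # Copy the manifest at registration time so the column is isolated
    # from later edits to the caller's object.
    return Column(name=name, manifest=deepcopy(manifest))


manifest = Manifest(image="python:3.11", pip=["numpy==1.26.4"])
column = register_column("embedding", manifest)

# Editing the original object afterwards does not change the snapshot.
manifest.pip.append("torch==2.3.0")
snapshot_pip = column.manifest.pip
```

In this model, a backfill that reads `column.manifest` still sees only the packages that were present when the column was registered.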
Manifests are immutable at the column / view level. When a transform is registered, its
manifest is snapshotted onto the column (or view) metadata. Changing the deployment-default
manifest, or the GenevaManifest object in your code, does not affect existing columns
or views: they keep using the snapshot taken at creation time. To move a column or view to a
new manifest, re-point it to a new (or updated) UDF / chunker / UDTF — for example with
alter_columns() for a column, or by recreating the view.
@udf(manifest=...)
Pin dependencies for a 1:1 computed column:
@chunker(manifest=...)
Pin dependencies for a 1:N chunker (scalar UDTF):
@udtf(manifest=...)
Pin dependencies for an N:M batch UDTF:
Capturing your local environment for testing
When iterating locally, you often want the workers to run with the exact packages from your current environment rather than a curated pip list. Connection.capture_local_environment()
zips your workspace (and, optionally, your site-packages), uploads the archives through the
connection, and returns a ready-to-use GenevaManifest you can attach to a transform with
manifest=.
Leave skip_site_packages=False (the default) to also upload your local site-packages.
Manifest resolution
For a given transform, the manifest is resolved in this order (first match wins):
- The manifest pinned on the transform via @udf / @chunker / @udtf manifest=.
- For a materialized view, the manifest snapshotted on the view when it was created.
- The deployment-default manifest from the LanceDB Helm chart.
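The first-match-wins order above can be sketched as a plain function. This is an illustrative model only, not Geneva's implementation: `resolve_manifest` and its parameter names are hypothetical, and manifests are represented as simple strings.

```python
from typing import Optional


def resolve_manifest(
    transform_manifest: Optional[str],
    view_snapshot: Optional[str],
    deployment_default: str,
) -> str:
    """Return the first manifest that is set, checked in priority order."""
    # 1. manifest pinned on the transform, 2. snapshot stored on the view.
    for candidate in (transform_manifest, view_snapshot):
        if candidate is not None:
            return candidate
    # 3. Fall back to the deployment-default manifest.
    return deployment_default


resolved_pinned = resolve_manifest("pinned", "view-snapshot", "default")
resolved_view = resolve_manifest(None, "view-snapshot", "default")
resolved_default = resolve_manifest(None, None, "default")
```

Here a pinned manifest takes priority over a view snapshot, and the deployment default is used only when neither is present.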
The manifest= argument applies to managed enterprise (db://) jobs. For direct
object-storage or local-filesystem connections, configure the dependency environment
explicitly with an Advanced Execution Context instead.