Overview
Geneva backfills operate on a point-in-time snapshot of your table. When other operations modify the table during or between backfills, conflicts can occur. Geneva >=0.9.0 automatically handles most conflict scenarios, reducing unnecessary recomputation and enabling graceful recovery.
Safe Operations During Backfill
These operations can safely run while a backfill is in progress:

| Operation | Why It’s Safe |
|---|---|
| merge_insert (Insert-only) | Creates new fragments without modifying existing ones |
| add() / append data | Creates new fragments without modifying existing ones |
| Read operations (search, to_arrow) | Read-only, no fragment modification |
| Adding new columns | Schema change only, no fragment rewrite |
Operations That Cause Conflicts
These operations can conflict with running backfills:

| Operation | What Happens |
|---|---|
| compact_files() / optimize() | Reorganizes fragments, invalidating the backfill’s snapshot |
| merge_insert with updates | Modifies existing rows, causing fragment conflicts |
| delete() | Modifies existing fragments |
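The difference between the two tables above comes down to whether an operation leaves the fragments in the backfill's snapshot untouched. The following is a toy model in plain Python, not Lance's actual storage format: `ToyTable` and `snapshot_still_valid` are hypothetical names introduced only to illustrate why appends are safe while deletes and compaction are not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model: a table is a mapping of fragment id -> list of rows."""
    fragments: dict = field(default_factory=dict)
    next_id: int = 0

    def append(self, rows):
        # Insert-only writes land in a brand-new fragment.
        self.fragments[self.next_id] = list(rows)
        self.next_id += 1

    def delete(self, predicate):
        # Deletes rewrite every fragment that holds a matching row.
        for frag_id, rows in self.fragments.items():
            kept = [r for r in rows if not predicate(r)]
            if len(kept) != len(rows):
                self.fragments[frag_id] = kept

    def compact(self):
        # Compaction replaces all fragments with one merged fragment.
        merged = [r for rows in self.fragments.values() for r in rows]
        self.fragments = {self.next_id: merged}
        self.next_id += 1

    def snapshot(self):
        # Point-in-time copy of fragment ids and their contents.
        return {fid: tuple(rows) for fid, rows in self.fragments.items()}


def snapshot_still_valid(snapshot, table):
    """True if every snapshot fragment still exists, unchanged, in the table."""
    return all(
        fid in table.fragments and tuple(table.fragments[fid]) == rows
        for fid, rows in snapshot.items()
    )


table = ToyTable()
table.append([1, 2, 3])
snap = table.snapshot()            # the backfill starts from this snapshot

table.append([4, 5])               # insert-only: snapshot fragments untouched
print(snapshot_still_valid(snap, table))

table.compact()                    # fragments reorganized: snapshot is stale
print(snapshot_still_valid(snap, table))
```

In this model the append leaves the snapshot valid, while compaction (and any delete that touches a snapshot fragment) invalidates it.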
How Geneva Handles Conflicts
Concurrent Backfills on Different Columns
When multiple backfills run on the same table but different columns, Geneva handles version conflicts automatically:
- Each backfill writes to different column files (field IDs)
- If a commit conflict occurs, Geneva retries at the latest version
- The retry merges the new column data without overwriting other columns
The maximum number of retries is controlled by the GENEVA_VERSION_CONFLICT_MAX_RETRIES environment variable (default: 10). See Advanced Configuration for details.
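The retry behavior described above follows a common optimistic-commit pattern. The sketch below is a generic illustration of that pattern, not Geneva's implementation: `VersionConflict`, `commit_with_retry`, and `FakeStore` are hypothetical names, and only the environment variable name and its default of 10 come from the text.

```python
import os


class VersionConflict(Exception):
    """Hypothetical error: another writer committed first."""


def max_retries_from_env(default=10):
    # Variable name and default value are taken from the text above.
    return int(os.environ.get("GENEVA_VERSION_CONFLICT_MAX_RETRIES", default))


def commit_with_retry(try_commit, latest_version, max_retries=None):
    """Call try_commit(version) until it succeeds or retries run out.

    try_commit and latest_version are caller-supplied callables
    (hypothetical stand-ins for the real commit machinery).
    """
    if max_retries is None:
        max_retries = max_retries_from_env()
    for attempt in range(max_retries + 1):
        version = latest_version()      # rebase onto the newest version
        try:
            return try_commit(version)
        except VersionConflict:
            if attempt == max_retries:
                raise                   # retries exhausted


class FakeStore:
    """In-memory stand-in that conflicts a fixed number of times."""

    def __init__(self, conflicts):
        self.version = 1
        self.conflicts = conflicts

    def latest(self):
        return self.version

    def try_commit(self, base_version):
        if self.conflicts > 0:
            self.conflicts -= 1
            self.version += 1           # a concurrent writer got in first
            raise VersionConflict()
        self.version = base_version + 1
        return self.version


store = FakeStore(conflicts=2)
print(commit_with_retry(store.try_commit, store.latest, max_retries=10))
```

Each retry re-reads the latest version before committing, which is what lets a column-only write land on top of another backfill's commit instead of overwriting it.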
Compaction Between Backfills
When you run compaction between backfills (not during), Geneva handles it efficiently:

| Scenario | Behavior |
|---|---|
| Backfill, compact, re-backfill (same UDF) | Already-computed rows are skipped via WHERE <col> IS NULL |
| Partial backfill, compact, resume | Incremental processing continues from where it left off |
| Backfill, alter_columns (new UDF), re-backfill | Full reprocessing with new UDF (intentional) |
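The first two rows of the table depend on processing only rows whose target column is still empty. A minimal pure-Python sketch of that idea follows; `backfill_column` and `text_len` are hypothetical names, and Python's `None` stands in for a NULL column value.

```python
def backfill_column(rows, column, udf):
    """Fill `column` only where it is currently None.

    Returns the number of rows the UDF was actually applied to.
    """
    processed = 0
    for row in rows:
        if row.get(column) is None:     # mirrors WHERE <col> IS NULL
            row[column] = udf(row)
            processed += 1
    return processed


def text_len(row):
    # Example UDF: length of the "text" field.
    return len(row["text"])


rows = [{"text": "a"}, {"text": "bb"}, {"text": "ccc"}]

# Simulate a partial backfill that only reached the first row.
rows[0]["length"] = text_len(rows[0])

resumed = backfill_column(rows, "length", text_len)   # fills the other rows
repeated = backfill_column(rows, "length", text_len)  # nothing left to do
print(resumed, repeated)
```

Because the filter looks only at the column's current values, reorganizing the underlying storage between runs does not change which rows still need work.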
Geneva’s default behavior is to skip rows that already have values by processing only rows that match WHERE <col> IS NULL. This means compaction doesn’t cause unnecessary recomputation.
Recovery Steps
When a conflict occurs during a backfill:
- Wait for any concurrent operations (compaction, updates) to complete
- Re-run the backfill
- Only uncomputed rows will be processed (rows with NULL values in the target column)
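The recovery steps above can be wrapped in a small helper. This is a sketch under stated assumptions: it assumes the table object exposes `backfill(column)` as listed in the API reference, while `rerun_backfill`, the `is_quiet` callback, and the choice of exception types are hypothetical, since the text does not say which error a conflict raises or how to detect running operations. `StubTable` exists only so the example runs without a real table.

```python
import time


def rerun_backfill(table, column, is_quiet, conflict_errors=(Exception,),
                   attempts=3, poll_seconds=1.0, sleep=time.sleep):
    """Wait for concurrent work to finish, then re-run the backfill.

    is_quiet: caller-supplied callable returning True once compaction or
    updates have completed (hypothetical; the check itself is up to you).
    conflict_errors: exception types to treat as a retryable conflict.
    """
    for attempt in range(1, attempts + 1):
        while not is_quiet():           # step 1: wait for other operations
            sleep(poll_seconds)
        try:
            return table.backfill(column)   # step 2: re-run the backfill
        except conflict_errors:
            if attempt == attempts:
                raise


class StubTable:
    """Stand-in table whose backfill fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def backfill(self, column):
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError("commit conflict")
        return f"backfilled {column}"


quiet_checks = iter([False, True, True])
stub = StubTable(failures=1)
result = rerun_backfill(
    stub, "embedding",
    is_quiet=lambda: next(quiet_checks),
    conflict_errors=(RuntimeError,),
    sleep=lambda seconds: None,         # skip real sleeping in the demo
)
print(result, stub.calls)
```

In real use, narrow `conflict_errors` to the specific exception your backfill raises on conflict rather than catching every `Exception`.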
Best Practices
Sequence Your Operations
For the smoothest experience, sequence your operations so that compaction, updates, and deletes run before or after a backfill rather than during it.
Use Insert-Only Operations During Backfill
If you need to add data while a backfill is running, use insert-only operations such as add() or an insert-only merge_insert.
Monitor Backfill Progress
Use async backfills to monitor progress and handle errors.
Disable Auto-Compaction During Large Backfills
If you are using LanceDB Enterprise, which has auto-compaction enabled, consider disabling it during large backfill operations to avoid conflicts.
Related
- Backfilling - Triggering and configuring backfill operations
- Advanced Configuration - Environment variables for retry behavior
API Reference
- Table - backfill(), add(), merge_insert(), and other table mutation methods