Geneva supports UDFs that take Lance Blobs (large binary objects) as input, and it can write out columns whose binaries are encoded as Lance Blobs. Lance Blobs are an optimization intended for large objects (from a few MB up to hundreds of MB) and provide a file-like object that lazily reads the underlying binary data.
Reading Blobs
Defining functions that read blob columns is straightforward. For scalar UDFs, blob columns are expected to be of type BlobFile.
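As a minimal sketch of what such a function body might look like, the example below hashes a blob in fixed-size chunks so the whole object never has to sit in memory at once. It assumes that BlobFile supports the usual `read(size)` method of Python file objects. The function name `blob_sha256`, the chunk size, and the use of `io.BytesIO` as a stand-in for a real BlobFile are illustrative choices, not part of Geneva.

```python
import hashlib
import io


def blob_sha256(blob, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file-like blob, read chunk by chunk.

    In a Geneva scalar UDF the `blob` argument would be annotated as BlobFile
    and the function registered as a UDF; both steps are omitted here so the
    sketch runs with the standard library alone.
    """
    digest = hashlib.sha256()
    while True:
        chunk = blob.read(chunk_size)  # read at most chunk_size bytes
        if not chunk:                  # empty result means end of blob
            break
        digest.update(chunk)
    return digest.hexdigest()


# io.BytesIO is only a stand-in for a BlobFile in this sketch.
payload = b"x" * (3 * (1 << 20) + 17)
hex_digest = blob_sha256(io.BytesIO(payload))
```

Reading in chunks fits the lazy, file-like design: only the bytes the function asks for are materialized, which matters for objects in the hundreds of MB.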
Writing Blobs
Defining UDFs that write out Blobs to a new column is straightforward. Add the standard blob metadata annotation to the UDF so that Geneva knows to write the column out as Blobs.
For scalar UDFs, the UDF returns bytes; explicitly set data_type to pa.large_binary() and add the field_metadata that specifies blob encoding.
For batched UDFs that operate on a pa.RecordBatch, the effort is similar.