LanceDB provides performant full-text search based on BM25, allowing you to incorporate keyword-based search in your retrieval solutions. This page shows examples of how to create and configure FTS indexes in LanceDB OSS and Enterprise, using the synchronous and asynchronous APIs.
In LanceDB Enterprise, the create_fts_index API returns immediately, but index building happens asynchronously.

Creating FTS Indexes
Synchronous API
Use create_fts_index with synchronous LanceDB connections:
Check FTS index status using the API:
Asynchronous API
When using async connections (connect_async), use create_index with the FTS configuration:
The create_fts_index method is not available on AsyncTable. Use create_index with FTS config instead.

Configuration Options
FTS Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| with_position | bool | False | Store token positions (required for phrase queries) |
| base_tokenizer | str | "simple" | Text splitting method (simple, whitespace, raw, or ngram) |
| language | str | "English" | Language for stemming/stop words |
| max_token_length | int | 40 | Maximum token size; longer tokens are omitted |
| lower_case | bool | True | Lowercase tokens |
| stem | bool | True | Apply stemming (running → run) |
| remove_stop_words | bool | True | Drop common stop words |
| ascii_folding | bool | True | Normalize accented characters |
| custom_stop_words | list[str] | None | Extra stop words to drop in addition to the language defaults. Requires remove_stop_words=True. |
| min_ngram_length | int | 3 | Minimum n-gram length. Applies only when base_tokenizer="ngram". |
| max_ngram_length | int | 3 | Maximum n-gram length. Applies only when base_tokenizer="ngram". |
| prefix_only | bool | False | Index only prefix n-grams rather than all substrings. Applies only when base_tokenizer="ngram". |

- max_token_length can filter out base64 blobs or long URLs.
- Disabling with_position reduces index size but disables phrase queries.
- ascii_folding helps with international text (e.g., “café” → “cafe”).
Phrase Query Configuration
Enable phrase queries by setting:

| Parameter | Required Value | Purpose |
|---|---|---|
| with_position | True | Track token positions for phrase matching |
| remove_stop_words | False | Preserve stop words for exact phrase matching |
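The reason both settings matter can be shown with a toy positional index in plain Python. This is an illustration of the general idea only, not LanceDB's implementation: the stop-word list, the whitespace tokenizer, and all function names are invented for the example.

```python
from collections import defaultdict

STOP_WORDS = {"the", "a", "of", "and"}  # tiny illustrative list


def tokenize(text, remove_stop_words):
    """Lowercase, split on whitespace, and optionally drop stop words."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def build_index(docs, remove_stop_words):
    """Map token -> {doc_id: [positions]}, i.e. an index that stores positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, tok in enumerate(tokenize(text, remove_stop_words)):
            index[tok][doc_id].append(pos)
    return index


def phrase_match(index, phrase, remove_stop_words):
    """Return ids of docs where the phrase tokens sit at consecutive positions."""
    tokens = tokenize(phrase, remove_stop_words)
    if not tokens or any(t not in index for t in tokens):
        return set()
    hits = set()
    for doc_id, starts in index[tokens[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, []) for i, t in enumerate(tokens)):
                hits.add(doc_id)
    return hits


docs = {1: "the lord of the rings", 2: "lord and rings"}

# Stop words kept: only the document containing the exact phrase matches.
exact = phrase_match(build_index(docs, False), "lord of the rings", False)

# Stop words dropped: both documents collapse to "lord rings" and both match.
loose = phrase_match(build_index(docs, True), "lord of the rings", True)

print(exact, loose)
```

Without stored positions there is nothing to compare for adjacency, and without stop words the phrase loses the tokens that distinguish the two documents.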