# Embeddings: Quickstart

> Quickstart guide for generating and working with embeddings.

export const TsQuickstartSchema = "const wordsSchema = lancedb.embedding.LanceSchema({\n  text: model.sourceField(new Utf8()),\n  vector: model.vectorField(),\n});\n";

export const TsQuickstartQuery = "const query = \"greetings\";\nconst actual = (await table.search(query).limit(1).toArray())[0];\nconsole.log(actual.text);\n";

export const TsQuickstartInitModel = "const model = (await lancedb.embedding\n  .getRegistry()\n  .get(\"huggingface\")\n  ?.create()) as lancedb.embedding.EmbeddingFunction;\n";

export const TsQuickstartImports = "import * as lancedb from \"@lancedb/lancedb\";\nimport \"@lancedb/lancedb/embedding/transformers\";\nimport { Utf8 } from \"apache-arrow\";\n";

export const TsQuickstartCreateTable = "const table = await db.createEmptyTable(\"words\", wordsSchema, {\n  mode: \"overwrite\",\n});\nawait table.add([{ text: \"hello world\" }, { text: \"goodbye world\" }]);\n";

export const TsQuickstartConnect = "const db = await lancedb.connect(\"data/sample-lancedb\");\n";

LanceDB can vectorize your data automatically at both ingestion and query time. All you need to do is specify which model to use.
Popular embedding providers and models are supported, including OpenAI, Hugging Face, Sentence Transformers, and CLIP.

## Step 1: Import Required Libraries

First, import the necessary LanceDB components:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
  import lancedb
  from lancedb.pydantic import LanceModel, Vector
  from lancedb.embeddings import get_registry
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartImports}
  </CodeBlock>
</CodeGroup>

* `lancedb`: The main database connection and operations
* `LanceModel`: Pydantic model for defining table schemas
* `Vector`: Field type for storing vector embeddings
* `get_registry()`: Access to the embedding function registry, which holds all built-in embedding functions as well as any custom ones you have registered
* TypeScript uses `lancedb.embedding.getRegistry()` and `lancedb.embedding.LanceSchema()` for the same registry/schema workflow
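
To make the registry idea concrete, here is a minimal, conceptual sketch of a name-to-class registry in plain Python. It is not LanceDB's implementation: `ToyEmbedding` and `Registry` are hypothetical names used only to illustrate why a call chain shaped like `get(...).create(...)` yields a configured embedding function.

```python
from typing import Dict, Type


class ToyEmbedding:
    """Hypothetical embedding function; not a real LanceDB class."""

    def __init__(self, name: str = "toy-default"):
        self.name = name

    @classmethod
    def create(cls, **kwargs) -> "ToyEmbedding":
        # Build a configured instance from keyword arguments.
        return cls(**kwargs)

    def ndims(self) -> int:
        # Every vector this toy model would produce has 3 dimensions.
        return 3


class Registry:
    """Maps a provider name to an embedding function class."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type[ToyEmbedding]] = {}

    def register(self, alias: str, cls: Type[ToyEmbedding]) -> None:
        self._classes[alias] = cls

    def get(self, alias: str) -> Type[ToyEmbedding]:
        return self._classes[alias]


registry = Registry()
registry.register("toy", ToyEmbedding)

# Look up the class by name, then create a configured instance.
model = registry.get("toy").create(name="my-model")
print(model.name, model.ndims())  # my-model 3
```

In this sketch, a custom embedding function is just another class registered under a new name, so it can be looked up the same way as a built-in one.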

## Step 2: Connect to LanceDB

Establish a connection to your LanceDB OSS directory or Enterprise cluster:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
  # Enter your LanceDB connection URI for OSS or Enterprise here
  db = lancedb.connect(...)
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartConnect}
  </CodeBlock>
</CodeGroup>

## Step 3: Initialize the Embedding Function

Choose and configure your embedding model:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
  model = get_registry().get("sentence-transformers").create(name="BAAI/bge-small-en-v1.5")
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartInitModel}
  </CodeBlock>
</CodeGroup>

This creates an embedding function from the local embedding registry. The Python snippet uses the
`sentence-transformers` provider with the BGE model; the TypeScript snippet uses the Transformers-backed
`huggingface` provider. You can:

* Change `"sentence-transformers"` to other providers like `"openai"`, `"cohere"`, etc.
* Modify the model name for different embedding models
* Set `device="cuda"` for GPU acceleration if available

## Step 4: Define Your Schema

Define your table structure. In Python, this is a Pydantic model built on `LanceModel`:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
  class Words(LanceModel):
      text: str = model.SourceField()
      vector: Vector(model.ndims()) = model.VectorField()
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartSchema}
  </CodeBlock>
</CodeGroup>

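* `SourceField()`: Marks the field whose contents will be embedded
* `VectorField()`: Marks the field that stores the generated embeddings
* `model.ndims()`: Returns the embedding dimensionality of the model, which sets the size of the `Vector` column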
* In TypeScript, use `model.sourceField(...)` and `model.vectorField()` inside `LanceSchema(...)`

## Step 5: Create Table and Ingest Data

Create a table with your schema and add data:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
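  # mode="overwrite" replaces any existing "words" table, matching the TypeScript snippet
  table = db.create_table("words", schema=Words, mode="overwrite")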
  table.add([
      {"text": "hello world"},
      {"text": "goodbye world"}
  ])
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartCreateTable}
  </CodeBlock>
</CodeGroup>

The `table.add()` call automatically:

* Takes the text from each document
* Generates embeddings using your chosen model
* Stores both the original text and the vector embeddings
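
The three steps above can be pictured with a small plain-Python sketch. This is an illustration only, not how LanceDB stores data: `toy_embed` and `add_rows` are hypothetical helpers, and the "embedding" is just a count of vowels so the example runs without any model.

```python
from typing import Dict, List


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedder: counts of the vowels a, e, i, o, u."""
    lowered = text.lower()
    return [float(lowered.count(v)) for v in "aeiou"]


def add_rows(table: List[Dict], rows: List[Dict]) -> None:
    """Fill in the vector column for each row, then store the row."""
    for row in rows:
        stored = dict(row)                         # keep the original text
        stored["vector"] = toy_embed(row["text"])  # computed at ingestion time
        table.append(stored)


table: List[Dict] = []
add_rows(table, [{"text": "hello world"}, {"text": "goodbye world"}])
for row in table:
    print(row)
```

The caller supplies only the `text` field; the `vector` field is derived from it during ingestion, which mirrors the role of the source and vector fields in the schema.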

## Step 6: Query with Automatic Embedding

Note: LanceDB Enterprise does not generate embeddings from query text on the server. With the Python remote
client, `table.search("greetings")` can still work when the table schema includes embedding metadata, because
the client computes the query embedding before sending the vector search. If there is no embedding metadata for
that search, `search("greetings")` in `auto` mode is treated as a full-text search (FTS) instead.
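
The decision described in this note can be sketched as follows. This is a simplified illustration rather than the client's actual code: `plan_text_search` and `toy_embed` are hypothetical, and passing `None` for `embed_fn` stands in for a table schema without embedding metadata.

```python
from typing import Callable, Dict, List, Optional

EmbedFn = Callable[[str], List[float]]


def plan_text_search(query: str, embed_fn: Optional[EmbedFn]) -> Dict:
    """Decide what a client would send for a text query in auto mode."""
    if embed_fn is not None:
        # Embedding metadata is available: embed on the client,
        # then send a vector search.
        return {"type": "vector", "vector": embed_fn(query)}
    # No embedding metadata: the text is treated as a full-text search.
    return {"type": "fts", "query": query}


def toy_embed(text: str) -> List[float]:
    # Hypothetical embedder: text length and word count.
    return [float(len(text)), float(len(text.split()))]


print(plan_text_search("greetings", toy_embed))
print(plan_text_search("greetings", None))
```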

Search your data using natural language queries:

<CodeGroup>
  ```python Python icon="python" theme={"theme":{"light":"vitesse-light","dark":"catppuccin-mocha"}}
  query = "greetings"
  actual = table.search(query).limit(1).to_pydantic(Words)[0]
  print(actual.text)
  ```

  <CodeBlock filename="TypeScript" language="typescript" icon="square-js">
    {TsQuickstartQuery}
  </CodeBlock>
</CodeGroup>

The search process:

1. Automatically converts your query text to embeddings
2. Finds the most similar vectors in your table
3. Returns the matching documents
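
As a rough mental model of these three steps, the sketch below runs the same flow in plain Python. Everything in it is an assumption made for illustration: the word vectors are hand-picked so that "greetings" sits near "hello", and cosine similarity is used as the ranking measure, whereas the metric LanceDB applies depends on your table and query configuration.

```python
import math
from typing import Dict, List

# Hypothetical 2-D word vectors, hand-picked for this example.
WORD_VECTORS: Dict[str, List[float]] = {
    "hello": [1.0, 0.1],
    "greetings": [0.9, 0.2],
    "goodbye": [0.1, 1.0],
    "world": [0.5, 0.5],
}


def toy_embed(text: str) -> List[float]:
    """Sum the vectors of known words; unknown words are ignored."""
    total = [0.0, 0.0]
    for word in text.lower().split():
        vec = WORD_VECTORS.get(word, [0.0, 0.0])
        total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(rows: List[Dict], query: str, limit: int = 1) -> List[Dict]:
    query_vec = toy_embed(query)                      # 1. embed the query text
    ranked = sorted(                                  # 2. rank rows by similarity
        rows,
        key=lambda r: cosine(query_vec, r["vector"]),
        reverse=True,
    )
    return ranked[:limit]                             # 3. return the best matches


rows = [{"text": t, "vector": toy_embed(t)} for t in ["hello world", "goodbye world"]]
print(search(rows, "greetings")[0]["text"])  # hello world
```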
