Jon Bratseth
Vespa Chief Architect

Text embedding made simple

"searching data using vector embeddings, unreal engine, high quality render, 4k, glossy, vivid colors, intricate detail" by Stable Diffusion

Embeddings are the basis for modern semantic search and neural ranking, so the first step in developing such features is to convert your document and query text to embeddings.

Once you have the embeddings, Vespa.ai makes it easy to use them efficiently to find neighbors or evaluate machine-learned models, but until now you've had to create them yourself, either on the client side or by writing your own Java component. Now the platform provides this building block as well.

On Vespa 8.54.61 or higher, simply add this to your services.xml file under <container>:

<component id="bert" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel path="models/bert-embedder.onnx"/>
        <tokenizerVocab path="models/vocab.txt"/>
    </config>
</component>

The model files here can be any BERT-style model and vocabulary. We recommend this one: huggingface.co/sentence-transformers/msmarco-MiniLM-L-6-v3.

With this deployed, you can automatically convert query text to an embedding by writing embed(bert, "my text") where you would otherwise supply an embedding tensor. For example:

input.query(myEmbedding)=embed(bert, "Hello world")
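To show where that parameter fits in a full request, here is a minimal Python sketch that builds the query parameters for a nearest-neighbor search. The document type name `doc`, the rank profile name `semantic`, the `localhost:8080` endpoint, and the helper name `build_embed_query` are assumptions made for illustration. The sketch also assumes the rank profile declares `query(myEmbedding)` with a tensor type matching the document field.

```python
from urllib.parse import urlencode


def build_embed_query(text, embedder_id="bert", tensor_name="myEmbedding",
                      field="myEmbedding", target_hits=10,
                      rank_profile="semantic"):
    """Build Vespa query parameters that let the server embed `text`.

    'doc' and the default rank profile name are hypothetical; adjust them
    to match your own schema.
    """
    # Assumption: keep the text free of double quotes so it remains a
    # single string argument inside embed(...).
    if '"' in text:
        raise ValueError("text must not contain double quotes")
    return {
        # Nearest-neighbor search: document field first, query tensor second.
        "yql": (
            f"select * from doc where "
            f"{{targetHits:{target_hits}}}nearestNeighbor({field}, {tensor_name})"
        ),
        # embed(...) stands in for an explicit tensor value.
        f"input.query({tensor_name})": f'embed({embedder_id}, "{text}")',
        "ranking": rank_profile,
    }


params = build_embed_query("Hello world")
# URL-encode the parameters for a GET request against the search endpoint.
print("http://localhost:8080/search/?" + urlencode(params))
```

The resulting URL can be sent with any HTTP client. The embedding itself is computed inside the container, so the client never handles the tensor.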

And to create an embedding from a document field you can add

field myEmbedding type tensor(x[384]) {
    indexing: input myTextField | embed bert
}

to your schema outside the document block.
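Because the embedding field is derived from myTextField at indexing time, a feed operation only needs to carry the text. The following Python sketch builds such a request for the /document/v1 API. The namespace `mynamespace`, the document type `doc`, and the helper name `build_feed_request` are assumptions made for illustration.

```python
import json
from urllib.parse import quote


def build_feed_request(doc_id, text, namespace="mynamespace", doc_type="doc"):
    """Return (path, JSON body) for feeding one document via /document/v1.

    The namespace and document type defaults are hypothetical.
    """
    # Percent-encode the id so it is safe to use as a path segment.
    path = f"/document/v1/{namespace}/{doc_type}/docid/{quote(doc_id, safe='')}"
    # Only the text field is sent; myEmbedding is produced by the
    # "input myTextField | embed bert" indexing statement.
    body = json.dumps({"fields": {"myTextField": text}})
    return path, body


path, body = build_feed_request("1", "Hello world")
print("POST", path)
print(body)
```

Note that the payload contains no tensor. The embedder runs as part of document processing when the document is written.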

Semantic search sample application

To get you started, we have created a complete, minimal sample application that uses this embedder: simple-semantic-search.

Further reading

This should make it easy to get started with embeddings. If you want to dig deeper into the topic, be sure to check out this blog post series on using pretrained transformer models for search, and this post on efficiently combining vector search with filters.