Introducing Private Embedding Models in Vespa Cloud

Image generated using OpenAI

We are thrilled to announce a new enhancement to Vespa Cloud: native support for private Hugging Face embedders!

This new capability enables developers to use proprietary or fine-tuned embedding models directly within Vespa, ensuring data privacy and maximizing relevance of search and recommendation in your applications.

Level up with private models

While public models (such as those available out of the box in Vespa Cloud) work well for most applications, some organizations need more control over their embeddings. This is particularly true in specialized business domains, where fine-tuned models can provide a competitive edge in both search and recommendation quality.

Private embedding models leverage our flexible component architecture, simplifying the integration process for developers and organizations:

1. Host your models securely: Upload and manage your private models in a secure, private Hugging Face model hub.
2. Grant Vespa access: Quickly and securely grant Vespa access to your models via the Vespa Console.
3. Configure your embedder: Set up your Hugging Face embedder component to use the new private model.

Integrating private models in 3 steps

To get started with private embedders in Vespa, first set up your private model hub on Hugging Face and upload a supported model. For information about the model types we support, please refer to the documentation.

Retrieve an API key from Hugging Face with the appropriate permissions and add it to the Vespa secret store via the Vespa Cloud Console.

To allow components to use the secret, add it to your container configuration. Inside the `container` element in `services.xml`, declare a secret using your vault and secret name:

<secrets>
    <myHuggingFaceSecret vault="my-vault" name="my-hugging-face-api-key"/>
</secrets>
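
For orientation, here is a minimal sketch of where that declaration could sit in a complete `services.xml`. It assumes a single container cluster; the cluster id `default` and the placeholder comment are illustrative and not taken from any particular application.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<services version="1.0">
    <!-- Hypothetical container cluster; "default" is an assumed id -->
    <container id="default" version="1.0">
        <!-- The element name (myHuggingFaceSecret) is the handle that
             components in this container use to refer to the secret -->
        <secrets>
            <myHuggingFaceSecret vault="my-vault" name="my-hugging-face-api-key"/>
        </secrets>
        <!-- Embedder components and the rest of the container
             configuration go here -->
    </container>
</services>
```

The `vault` and `name` attributes should match what was entered in the Vespa Cloud Console, while the element name itself is free to choose.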

Next, configure your embedder component to use the private model. In the embedder component configuration, add the URL of your private model on Hugging Face and reference the declared secret:

<component id="hf-embedder" type="hugging-face-embedder">
    <transformer-model
        url="https://huggingface.co/company/private-hub/resolve/hash/model.onnx"
        secret-ref="myHuggingFaceSecret"
    />
    <tokenizer-model
        url="https://huggingface.co/company/private-hub/resolve/hash/tokenizer.json"
        secret-ref="myHuggingFaceSecret"
    />
</component>

We recommend referring to a specific commit hash in the URL rather than a branch name, so that you have full control over which model version your cluster uses. For additional information about private models and how to use embedders in your application, refer to the embedder documentation.
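
As an illustration of pinning, the sketch below points both model files at the same revision. The repository name and the 40-character commit hash are made-up placeholders, and the URL follows Hugging Face's `resolve/<revision>/<file>` download path; substitute the values from your own model hub.

```xml
<component id="hf-embedder" type="hugging-face-embedder">
    <!-- Placeholder commit hash: replace with the revision you have validated.
         A branch name such as "main" in the same position would follow the
         branch as it moves, so the model could change between deployments. -->
    <transformer-model
        url="https://huggingface.co/company/private-hub/resolve/0123456789abcdef0123456789abcdef01234567/model.onnx"
        secret-ref="myHuggingFaceSecret"
    />
    <!-- Pin the tokenizer to the same revision so it stays in sync with the model -->
    <tokenizer-model
        url="https://huggingface.co/company/private-hub/resolve/0123456789abcdef0123456789abcdef01234567/tokenizer.json"
        secret-ref="myHuggingFaceSecret"
    />
</component>
```

Rolling out a new model version then becomes an explicit change to the hash in `services.xml`, followed by a redeployment.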

We look forward to seeing how you will take advantage of private embedding models to build even more powerful AI applications with Vespa.ai!


