Introducing Private Embedding Models in Vespa Cloud
We are thrilled to announce a new enhancement to Vespa Cloud: native support for private Hugging Face embedders!
This capability lets developers use proprietary or fine-tuned embedding models directly within Vespa, preserving data privacy while maximizing the relevance of search and recommendations in their applications.
Level-up with private models
While public models (such as those available out-of-the-box in Vespa Cloud) are great for most applications, some organizations require additional control over their embeddings. This is particularly true in specialized business domains, where fine-tuned models can provide a competitive edge in both search and recommendation quality.
Private embedding models leverage our flexible component architecture, simplifying the integration process for developers and organizations:
- Host your models securely: Upload and manage your private models in a secure, private Hugging Face model hub.
- Grant Vespa access: Quickly and securely grant Vespa access to your models via the Vespa Console.
- Configure your embedder: Set up your Hugging Face embedder component to use the new private model.
Integrating private models in 3 steps
To get started with private embedders in Vespa, first set up your private model hub on Hugging Face and upload a supported model. For information about the model types we support, please refer to the documentation.
Next, retrieve an API key from Hugging Face with the appropriate permissions and add it to the Vespa secret store via the Vespa Cloud Console.
To allow components to use the secret, add it to your container configuration. Inside the `container` tag in `services.xml`, declare a secret using your vault and secret name:
<secrets>
    <myHuggingFaceSecret vault="my-vault" name="my-hugging-face-api-key"/>
</secrets>
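For context, a minimal sketch of where this declaration sits in `services.xml` might look as follows. The container id `default` and the surrounding `services` element are illustrative assumptions; only the `secrets` block comes from the step above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<services version="1.0">
    <!-- Container id "default" is an assumed, illustrative name -->
    <container id="default" version="1.0">
        <!-- Declares the secret so that components in this container
             can reference it by its element name -->
        <secrets>
            <myHuggingFaceSecret vault="my-vault" name="my-hugging-face-api-key"/>
        </secrets>
    </container>
</services>
```

The element name (`myHuggingFaceSecret` here) is the handle that components use to refer to the secret, so it is worth choosing a name that describes its purpose.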
Next, configure your embedder component to use the private model. In the embedder component configuration, add the URL to your private model in Hugging Face and reference the declared secret:
<component id="hf-embedder" type="hugging-face-embedder">
    <transformer-model
        url="https://huggingface.co/company/private-hub/resolve/hash/model.onnx"
        secret-ref="myHuggingFaceSecret"
    />
    <tokenizer-model
        url="https://huggingface.co/company/private-hub/resolve/hash/tokenizer.json"
        secret-ref="myHuggingFaceSecret"
    />
</component>
We recommend referencing a specific commit hash in the URL rather than a branch name, so that you have full control over which model version your cluster uses. For additional information about private models and how to use embedders in your application, refer to the embedder documentation.
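As a rough illustration of the difference, the sketch below contrasts a branch-based URL with a commit-pinned one. It assumes Hugging Face's `resolve/<revision>/<path>` download URL pattern; `COMMIT_HASH` is a placeholder for the commit SHA you have validated, and `company/private-hub` is the same illustrative repository name used above.

```xml
<component id="hf-embedder" type="hugging-face-embedder">
    <!-- Avoid: a branch name such as "main" points to new files
         whenever the repository is updated, for example:
         https://huggingface.co/company/private-hub/resolve/main/model.onnx -->

    <!-- Prefer: pin both files to the same commit (COMMIT_HASH is a placeholder) -->
    <transformer-model
        url="https://huggingface.co/company/private-hub/resolve/COMMIT_HASH/model.onnx"
        secret-ref="myHuggingFaceSecret"/>
    <tokenizer-model
        url="https://huggingface.co/company/private-hub/resolve/COMMIT_HASH/tokenizer.json"
        secret-ref="myHuggingFaceSecret"/>
</component>
```

Pinning the model and the tokenizer to the same commit keeps the two files in sync, so a later push to the repository cannot silently change what the cluster serves.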
We look forward to seeing how you will take advantage of private embedding models to build even more powerful AI applications with Vespa.ai!