Advances in Vespa features and performance include multi-vector HNSW indexing, global-phase re-ranking, LangChain support, improved bfloat16 throughput, and new document feed/export features in the Vespa CLI.
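As a quick illustration of the new CLI features, documents can be fed and exported directly from the command line; the file names below are placeholders:

```
# Feed documents from a JSONL file using the CLI's feed client
vespa feed documents.jsonl

# Export documents by visiting the content cluster, writing them to a local file
vespa visit > exported-documents.jsonl
```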
Announcing global-phase re-ranking support in Vespa, unlocking efficient re-ranking with precise cross-encoder models, which help minimize distraction in retrieval-augmented completions generated by Large Language Models.
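A minimal sketch of a rank profile using the new global-phase; the `text` field and the `my_cross_encoder` ONNX model are assumptions for illustration:

```
rank-profile rerank-with-cross-encoder inherits default {
    first-phase {
        # Cheap per-node scoring used for initial retrieval
        expression: bm25(text)
    }
    global-phase {
        # Re-rank the globally merged top hits with a precise
        # (hypothetical) cross-encoder ONNX model
        rerank-count: 100
        expression: sum(onnx(my_cross_encoder))
    }
}
```

The global-phase runs in the stateless container after results from all content nodes are merged, so the expensive model is only evaluated for the top `rerank-count` hits.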
Deep-learned embeddings are popular for search and recommendation use cases. This post introduces the concept of using reusable frozen embeddings and tailoring them with Vespa.
Announcing multi-vector indexing support in Vespa, which allows you to index multiple vectors per document and retrieve documents by the closest vector in each document.
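A minimal schema sketch of multi-vector indexing, assuming 384-dimensional paragraph embeddings and an embedder named `e5` configured in services.xml (all names and dimensions are illustrative):

```
schema doc {
    document doc {
        field paragraphs type array<string> {
            indexing: index | summary
        }
    }
    # One embedding per paragraph: mapped dimension p, indexed dimension x
    field embeddings type tensor<float>(p{}, x[384]) {
        indexing: input paragraphs | embed e5 | attribute | index
        attribute {
            distance-metric: angular
        }
    }
}
```

A nearestNeighbor query over `embeddings` then matches each document by the distance to its closest paragraph vector.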
Set up private endpoint services on your Vespa Cloud application, and access them from your own VPC, in the same region, through the cloud provider's private network.
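A sketch of what enabling a private endpoint in deployment.xml can look like, assuming AWS PrivateLink; the region and account ARN are placeholders:

```
<deployment version="1.0">
    <prod>
        <region>aws-us-east-1c</region>
    </prod>
    <endpoints>
        <!-- Expose a private endpoint service and allow access from the given AWS account -->
        <endpoint type="private">
            <region>aws-us-east-1c</region>
            <allow with="aws-private-link" arn="arn:aws:iam::123456789012:root"/>
        </endpoint>
    </endpoints>
</deployment>
```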
Advances in Vespa features and performance include GPU support, advanced BCP autoscaling, GCP Private Service Connect, and a great update to the e-commerce sample app.
Distilling the knowledge and power of generative Large Language Models (LLMs) with billions of parameters into ranking models with a few million parameters.
Advances in Vespa features and performance include better tensor formats, AWS PrivateLink, autoscaling, data plane access control, and container and content node performance improvements.