All Stories
Vespa Newsletter, May 2023
Advances in Vespa features and performance include multi-vector HNSW indexing, global-phase re-ranking, LangChain support, improved bfloat16 throughput, and new document feed/export features in the Vespa CLI.
High performance feeding with Vespa CLI
Vespa CLI can now feed large sets of documents to Vespa efficiently.
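As a quick, hedged illustration: assuming a JSONL file of documents in Vespa's feed format and a running local deployment, the new command looks roughly like this (the file name and target are placeholders):

```sh
# Feed a file of documents to the deployment selected by --target.
vespa feed my-documents.jsonl --target local
```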
Vespa support in LangChain
LangChain now comes with a Vespa retriever.
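A minimal sketch of what using it can look like, assuming a reachable Vespa application; the endpoint, content field, and YQL below are illustrative placeholders, not required values:

```python
# Sketch: retrieve documents from Vespa via LangChain's VespaRetriever.
# Endpoint, content field, and YQL are placeholders for your own application.
from langchain.retrievers import VespaRetriever

retriever = VespaRetriever.from_params(
    "https://my-app.my-tenant.aws-us-east-1c.z.vespa-app.cloud",  # placeholder endpoint
    "content",  # field exposed as each Document's page content
    yql="select content from paragraph where userQuery()",
    k=5,
)
docs = retriever.get_relevant_documents("how does vespa rank documents?")
```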
Minimizing LLM Distraction with Cross-Encoder Re-Ranking
Announcing global-phase re-ranking support in Vespa, unlocking efficient re-ranking with precise cross-encoder models, which help minimize distraction in retrieval-augmented completions generated by Large Language Models.
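As an illustrative sketch (not a profile from the post), a rank profile can keep a cheap first phase and re-rank the globally merged top hits with a cross-encoder in the new global phase; the ONNX model name and rerank count are assumptions, and the model itself would be declared elsewhere in the schema:

```
rank-profile rerank-with-cross-encoder inherits default {
    first-phase {
        # Cheap per-node scoring
        expression: bm25(title) + bm25(body)
    }
    global-phase {
        # Precise scoring of the merged top hits, using a hypothetical
        # onnx-model named cross_encoder
        expression: onnx(cross_encoder){d0:0,d1:0}
        rerank-count: 100
    }
}
```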
Customizing Reusable Frozen ML-Embeddings with Vespa
Deep-learned embeddings are popular for search and recommendation use cases. This post introduces the concept of using reusable frozen embeddings and tailoring them with Vespa.
Revolutionizing Semantic Search with Multi-Vector HNSW Indexing in Vespa
Announcing multi-vector indexing support in Vespa, which allows you to index multiple vectors per document and retrieve documents by the closest vector in each document.
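As a hedged sketch of what this can look like in a schema, a field holding one embedding per paragraph combines a mapped dimension with an indexed vector dimension; the field names, embedder, and dimensions below are illustrative:

```
# One vector per paragraph in the same document, indexed with HNSW.
field paragraph_embeddings type tensor<float>(p{}, x[384]) {
    indexing: input paragraphs | embed my_embedder | attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 100
        }
    }
}
```

At query time, a nearestNeighbor operator over such a field matches a document if any of its paragraph vectors is close to the query vector, with closeness taken from the best-matching one.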
Private regional endpoints in Vespa Cloud
Set up private endpoint services on your Vespa Cloud application, and access them from your own VPC, in the same region, through the cloud provider's private network.
Vespa Newsletter, March 2023
Advances in Vespa features and performance include GPU support, advanced BCP autoscaling, GCP Private Service Connect, and a great update to the e-commerce sample app.
GPU-accelerated ML inference in Vespa Cloud
Today we're introducing support for GPU-accelerated ONNX model inference in Vespa, together with support for GPU instances in Vespa Cloud!
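As a rough sketch, assuming Vespa Cloud, GPU resources are requested per node in services.xml; the counts and sizes below are placeholders:

```xml
<!-- Illustrative fragment: request one GPU per container node. -->
<container id="default" version="1.0">
    <nodes count="2">
        <resources vcpu="4" memory="16Gb" disk="125Gb">
            <gpu count="1" memory="16Gb"/>
        </resources>
    </nodes>
</container>
```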
Improving Search Ranking with Few-Shot Prompting of LLMs
Distilling the knowledge and power of generative Large Language Models (LLMs) with billions of parameters into ranking models with a few million parameters.
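The gist: prompt a large instruction-tuned model with a few labeled query-passage pairs so it can grade new pairs, then train a small ranking model on those grades. A minimal, hypothetical sketch of the prompt construction (no particular LLM API assumed):

```python
# Hypothetical sketch: build a few-shot relevance-grading prompt.
# Example pairs and the 0-2 scale are placeholders; the LLM's grades would
# become training labels for a compact ranking model.
FEW_SHOT_EXAMPLES = [
    ("how to enable hnsw indexing", "HNSW is enabled per tensor field ...", 2),
    ("weather in oslo today", "HNSW is enabled per tensor field ...", 0),
]

def build_prompt(query: str, passage: str) -> str:
    parts = ["Grade how relevant the passage is to the query, "
             "from 0 (not relevant) to 2 (highly relevant)."]
    for q, p, grade in FEW_SHOT_EXAMPLES:
        parts.append(f"Query: {q}\nPassage: {p}\nGrade: {grade}")
    parts.append(f"Query: {query}\nPassage: {passage}\nGrade:")
    return "\n\n".join(parts)
```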
Vespa Newsletter, January 2023
Advances in Vespa features and performance include better tensor formats, AWS PrivateLink, autoscaling, data plane access control, and container and content node performance.