All Stories

Vespa Newsletter, May 2023

Advances in Vespa features and performance include multi-vector HNSW indexing, global-phase re-ranking, LangChain support, improved bfloat16 throughput, and new document feed/export features in the Vespa CLI.

High performance feeding with Vespa CLI

Vespa CLI can now feed large sets of documents to Vespa efficiently.

Vespa support in LangChain

LangChain now comes with a Vespa retriever.

Minimizing LLM Distraction with Cross-Encoder Re-Ranking

Announcing global-phase re-ranking support in Vespa, unlocking efficient re-ranking with precise cross-encoder models. Cross-encoder models minimize distraction in retrieval-augmented completions generated by Large Language Models.

Customizing Reusable Frozen ML-Embeddings with Vespa

Deep-learned embeddings are popular for search and recommendation use cases. This post introduces the concept of using reusable frozen embeddings and tailoring them with Vespa.

Revolutionizing Semantic Search with Multi-Vector HNSW Indexing in Vespa

Announcing multi-vector indexing support in Vespa, which allows you to index multiple vectors per document and retrieve documents by the closest vector in each document.

Private regional endpoints in Vespa Cloud

Set up private endpoint services on your Vespa Cloud application, and access them from your own VPC, in the same region, through the cloud provider's private network.

Vespa Newsletter, March 2023

Advances in Vespa features and performance include GPU support, advanced BCP autoscaling, GCP Private Service Connect, and a great update to the e-commerce sample app.

GPU-accelerated ML inference in Vespa Cloud

Today we're introducing support for GPU-accelerated ONNX model inference in Vespa, together with support for GPU instances in Vespa Cloud!

Improving Search Ranking with Few-Shot Prompting of LLMs

Distilling the knowledge and power of generative Large Language Models (LLMs) with billions of parameters to ranking models with a few million parameters.

Vespa Newsletter, January 2023

Advances in Vespa features and performance include better tensor formats, AWS PrivateLink, autoscaling, data plane access control, and container and content node performance.

Improving Zero-Shot Ranking with Vespa Hybrid Search - part two

Where should you begin if you plan to implement search functionality but have not yet collected data from user interactions to train ranking models?