All Stories

Announcing vector streaming search: AI assistants at scale without breaking the bank

With personal data, you need complete results at low cost, something vector databases cannot provide. Vespa's new vector streaming search delivers complete results at a fraction of the cost.

Vespa at Berlin Buzzwords 2023

Summarizing Berlin Buzzwords 2023, Germany’s most exciting conference on storing, processing, streaming and searching large amounts of digital data.

Enhancing Vespa’s Embedding Management Capabilities

We are thrilled to announce significant updates to Vespa’s support for inference with text embedding models that map text into vector representations.

Vespa Newsletter, May 2023

Advances in Vespa features and performance include multi-vector HNSW Indexing, global-phase re-ranking, LangChain support, improved bfloat16 throughput, and new document feed/export features in the Vespa CLI.

High performance feeding with Vespa CLI

Vespa CLI can now feed large sets of documents to Vespa efficiently.

Vespa support in LangChain

LangChain now comes with a Vespa retriever.

Minimizing LLM Distraction with Cross-Encoder Re-Ranking

Announcing global-phase re-ranking support in Vespa, unlocking efficient re-ranking with precise cross-encoder models. Cross-encoder models minimize distraction in retrieval-augmented completions generated by Large Language Models.

Customizing Reusable Frozen ML-Embeddings with Vespa

Deep-learned embeddings are popular for search and recommendation use cases. This post introduces the concept of using reusable frozen embeddings and tailoring them with Vespa.

Revolutionizing Semantic Search with Multi-Vector HNSW Indexing in Vespa

Announcing multi-vector indexing support in Vespa, which allows you to index multiple vectors per document and retrieve documents by the closest vector in each document.

Private regional endpoints in Vespa Cloud

Set up private endpoint services on your Vespa Cloud application, and access them from your own VPC, in the same region, through the cloud provider's private network.

Vespa Newsletter, March 2023

Advances in Vespa features and performance include GPU support, advanced BCP autoscaling, GCP Private Service Connect, and a great update to the e-commerce sample app.

GPU-accelerated ML inference in Vespa Cloud

Today we're introducing support for GPU-accelerated ONNX model inference in Vespa, together with support for GPU instances in Vespa Cloud.