Vespa Newsletter, December 2023
It’s December, and there is still time to complete the Advent of Tensors challenge! It is a great way to learn about tensors and how to use them, and to be in the race to win Vespa swag!
2023 Vespa Open Source Survey
It’s been a great year both for search and recommendation as an industry and for Vespa! You are using Vespa in many innovative ways we had not thought of, which is awesome! To help us improve the feature set and hopefully remove some pain points, we kindly ask you to spend a few minutes on the 2023 Vespa Open Source Survey. Thanks in advance!
In the previous update, we mentioned Vespa Cloud Enclave, Lucene Linguistics integration, faster fuzzy matching, cluster-specific model-serving settings, and automated BM25 reconfiguration. Today, we’re excited to share the following updates:
Global-phase cross-hit normalization
Vespa provides two-phase ranking, which lets you rerank the top hits using a ranking function that is too expensive to evaluate for all matches. Both of these phases are executed locally on content nodes.
In 8.246 we introduced a third ranking phase, global-phase, which is executed on container nodes. Because it operates on the global list of hits after the results from all content nodes are merged, it can evaluate normalizing functions that need access to the global top list of hits. You can now add global-phase to your rank profile to evaluate any ranking function on container nodes, where you also have access to the new normalizing functions normalize_linear, reciprocal_rank and reciprocal_rank_fusion. For example, normalize_linear normalizes scores into a [0,1] range.
With this, it is easier to keep any single factor from dominating the final score; bm25, for example, has an unbounded range.
reciprocal_rank is useful when the order of hits matters but the rank scores themselves do not. Think of it as another normalization function with a [0,1] range, where only the rank information is preserved. Read more in cross-hit normalization including reciprocal rank fusion, and the blog post using reciprocal_rank_fusion.
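To make the arithmetic behind these functions concrete, here is a minimal Python sketch of what they compute over a merged hit list. It is an illustration only, not Vespa's implementation: the function bodies, the constant k = 60, the handling of identical scores, and the sample scores are all assumptions made for this example.

```python
from typing import List


def normalize_linear(scores: List[float]) -> List[float]:
    """Min-max scale the scores of the merged hit list into [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption for this sketch: with no spread, map every hit to 0.0
        # rather than dividing by zero.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def reciprocal_rank(scores: List[float], k: float = 60.0) -> List[float]:
    """Return 1 / (k + rank) per hit, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0.0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = 1.0 / (k + rank)
    return result


def reciprocal_rank_fusion(*features: List[float], k: float = 60.0) -> List[float]:
    """Sum the reciprocal ranks of several features, hit by hit."""
    per_feature = [reciprocal_rank(f, k) for f in features]
    return [sum(values) for values in zip(*per_feature)]


# Made-up scores for three hits, one list per ranking feature.
bm25_scores = [12.0, 3.0, 7.5]        # unbounded text-match scores
closeness_scores = [0.2, 0.9, 0.5]    # vector closeness scores

print(normalize_linear(bm25_scores))
print(reciprocal_rank_fusion(bm25_scores, closeness_scores))
```

Note that the fusion result depends only on each hit's position within each feature, so the different scales of the two inputs do not matter.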
New features in Pyvespa
Pyvespa has a new, better-performing API for feeding collections of data - see feed_iterable in 0.38 and 0.39, and the example use. Please note that the previous batch feed functions were deprecated and subsequently removed from pyvespa.
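As a rough sketch of how feeding with feed_iterable can look, the snippet below turns raw rows into a lazy stream of feed operations and passes it to the application object. The schema name, namespace, field name, and helper functions are hypothetical, and `app` is assumed to be a pyvespa `Vespa` instance connected to a running application.

```python
from typing import Any, Dict, Iterable, Iterator


def to_feed_operations(rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Lazily turn raw rows into {"id": ..., "fields": ...} dicts for feeding."""
    for row in rows:
        yield {"id": str(row["id"]), "fields": {"title": row["title"]}}


def report(response: Any, doc_id: str) -> None:
    """Callback invoked once per document; only failures are printed."""
    if not response.is_successful():
        print(f"Feeding {doc_id} failed: {response.get_json()}")


def feed_rows(app: Any, rows: Iterable[Dict[str, Any]]) -> None:
    """Feed rows via app.feed_iterable; app is assumed to be a pyvespa Vespa instance."""
    app.feed_iterable(
        to_feed_operations(rows),
        schema="doc",           # hypothetical schema name
        namespace="tutorial",   # hypothetical namespace
        callback=report,
    )
```

Because the operations come from a generator, the whole data set never needs to be held in memory before feeding starts.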
Token Authentication for Data Plane Access
Vespa and Vespa Cloud support mTLS for security. In November, we added support for data plane Token Authentication in Vespa Cloud - see the announcement. You can use this to get API access without using certificates:
$ curl -H "Authorization: Bearer vespa_cloud_...." \
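The same bearer-token header can be sent from Python. This is a minimal sketch using only the standard library; the endpoint URL, the /search/ query path, and the YQL string are placeholders chosen for illustration, and the token value is left elided as in the command above.

```python
import urllib.parse
import urllib.request


def build_token_request(endpoint: str, token: str, yql: str) -> urllib.request.Request:
    """Build a query request that authenticates with a data plane token."""
    query = urllib.parse.urlencode({"yql": yql})
    url = f"{endpoint.rstrip('/')}/search/?{query}"
    # The token goes in a standard Authorization header; no client certificate is used.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_token_request(
    "https://example.invalid",             # placeholder for your token endpoint
    "vespa_cloud_....",                    # elided token, as in the curl command
    "select * from sources * where true",  # illustrative query
)
print(request.full_url)
# To send it against a real endpoint: urllib.request.urlopen(request)
```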
More new features
- You can now download the active application package using vespa fetch - see the cheat sheet. This makes it easier to get the active configuration to replicate application instances.
- unpack_bits(t) is a new ranking function that unpacks the bits of int8 input cells into 8 times as many floats. The innermost indexed dimension expands to have 8 times as many cells, each with a float value of either 0.0 or 1.0, determined by one bit in the 8-bit input value. It offers the same basic functionality as numpy.unpackbits. Since Vespa 8.256.
- You can now get the tokens indexed by Vespa returned with a query. This is helpful for debugging linguistics transformations; see this example. Since Vespa 8.243.
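The bit unpacking described for unpack_bits in the list above can be illustrated in plain Python. This sketch handles a single innermost dimension given as a list of int8 values and assumes most-significant-bit-first ordering, which matches the numpy.unpackbits default; the function name is hypothetical and this is not Vespa's implementation.

```python
from typing import List


def unpack_bits_sketch(cells: List[int]) -> List[float]:
    """Expand each int8 cell into 8 floats (0.0 or 1.0), most significant bit first."""
    unpacked: List[float] = []
    for value in cells:
        if not -128 <= value <= 127:
            raise ValueError(f"{value} is outside the int8 range")
        byte = value & 0xFF  # two's-complement bit pattern of the int8 value
        for shift in range(7, -1, -1):
            unpacked.append(float((byte >> shift) & 1))
    return unpacked


# Two int8 cells become sixteen float cells.
print(unpack_bits_sketch([5, -128]))
```

Packing bits this way lets a binary vector be stored in one eighth of the cells and expanded to floats only when a ranking expression needs them.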
Blog posts since last newsletter
- Announcing our series A funding
- Yahoo Mail turns to Vespa to do RAG at scale
- Anonymized endpoints and token authentication in Vespa Cloud
- Changes in OS support for Vespa
- Hands-On RAG guide for personal data with Vespa and LLamaIndex
- Advent of Tensors 2023 🎅
- A new visual identity for a new era
- Turbocharge RAG with LangChain and Vespa Streaming Mode for Sharded Data