Vespa at Berlin Buzzwords 2023
Jo Kristian Bergum presenting on using LLMs for training ranking models at Berlin Buzzwords 2023
Berlin Buzzwords 2023 has just finished and we thought it would be great to summarize the event. Berlin Buzzwords is Germany’s most exciting conference on storing, processing, streaming and searching large amounts of digital data, with a focus on open source software projects. This year, the conference was filled with exciting talks about Large Language Models (LLMs) and neural search techniques.
Boosting Ranking Performance with Minimal Supervision
Jo Kristian Bergum from the Vespa team gave a talk on Boosting Ranking Performance with Minimal Supervision.
The talk showed how to use generative Large Language Models (LLMs) to produce synthetic labeled data for training in-domain ranking models, distilling the knowledge and power of generative LLMs into effective ranking models.
If this talk interested you, check out some of our previous work on zero-shot ranking and adapting ranking models to new domains using LLMs:
- Improving Search Ranking with Few-Shot Prompting of LLMs
- Improving Zero-Shot Ranking with Vespa Hybrid Search
- Improving Zero-Shot Ranking with Vespa Hybrid Search - part two
In the context of ranking and retrieving context for LLMs we can also recommend:
The Debate Returns (with more vectors): Which Search Engine?
Jo Kristian Bergum from the Vespa team joined a panel of search engine and vector search experts to discuss and contrast search technologies.
Privacy-Preserving Web Search
Barcamp at Berlin Buzzwords 2023
The Berlin Buzzwords Barcamp is an informal session with a schedule decided on the day. This session was not recorded.
Tom Gilke from otto.de presenting at Berlin Buzzwords 2023.
We also recommend the talk How we built the autosuggest infrastructure for otto.de, in which the otto.de team explains how they migrated the infrastructure that powers their search suggestions. They present their iterations, moving from Elasticsearch to a simple Python solution and, in the end, to Vespa.
Learning to hybrid search
We at the Vespa team have also worked with this large e-commerce ranking dataset in our blog series on Improving Product Search with Learning to Rank:
- Part one: introduction to ESCI product ranking dataset
- Part two: neural methods
- Part three: GBDT methods (hybrid)
Vectorize Your Open Source Search Engine
Atita Arora gave a talk on vector search using bi-encoders, which map queries and documents into a latent embedding vector space where similarity search is performed using nearest neighbor search.
Atita Arora from Open Source Connections presenting at Berlin Buzzwords 2023.
One key takeaway from the talk was a relevance evaluation breakdown by query type intent, where we can clearly see that vector search alone does not solve all search use cases.
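To make the bi-encoder idea from this talk concrete, here is a minimal sketch in Python. It is not code from the talk: `build_vocab`, `toy_encode` and `nearest_neighbors` are hypothetical names, the sample documents are made up, and the encoder is a plain bag-of-words stand-in for a trained embedding model. It also uses exact brute-force search, whereas a search engine would typically use an approximate nearest neighbor index. The point is only to show the mechanics: queries and documents are encoded independently into the same vector space, and retrieval ranks documents by vector similarity.

```python
import math


def build_vocab(texts):
    """Assign each distinct token an index (assumption: whitespace tokenization)."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def toy_encode(text, vocab):
    """Hypothetical stand-in for a bi-encoder: an L2-normalized bag-of-words vector.

    A real bi-encoder would be a trained neural model producing dense embeddings.
    """
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def nearest_neighbors(query, doc_vectors, vocab, k=2):
    """Exact (brute-force) nearest neighbor search by cosine similarity."""
    q = toy_encode(query, vocab)
    # Vectors are unit length, so the dot product equals the cosine similarity.
    scored = [
        (sum(a * b for a, b in zip(q, vec)), doc_id)
        for doc_id, vec in doc_vectors.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Made-up sample documents, encoded once at indexing time.
docs = {
    "d1": "red running shoes",
    "d2": "wireless noise cancelling headphones",
    "d3": "blue running jacket",
}
vocab = build_vocab(docs.values())
doc_vectors = {doc_id: toy_encode(text, vocab) for doc_id, text in docs.items()}

# The query is encoded separately from the documents at query time.
results = nearest_neighbors("running shoes", doc_vectors, vocab, k=2)
print(results)  # d1 ranks first, followed by d3
```

Because this toy encoder only counts overlapping tokens, it cannot capture semantic similarity the way a trained bi-encoder can; swapping `toy_encode` for a real embedding model would leave the retrieval step unchanged.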
The state of Neural Search and LLMs, interview with Jo Kristian Bergum - Berlin Buzzwords 2023
Jo Kristian Bergum from the Vespa team joined Jakub Zavrel, Founder and CEO of Zeta Alpha, to talk about the state of Neural Search and LLMs.
Hybrid search is buzzing
This year, the conference was filled with talks on hybrid search, and we think it’s worth mentioning Lester Solbakken’s great talk from Berlin Buzzwords 2022, where he presented Hybrid search > sum of its parts?
Berlin Buzzwords is a highly regarded industry conference that brings together experts and professionals from various fields to discuss the latest trends and advancements in storage, processing, streaming, and search. One noticeable aspect of the 2023 edition was the significant emphasis on search-related topics, the role of LLMs in search, and neural hybrid search.