Vespa Newsletter, January 2025
In the previous update, we mentioned Elasticsearch vs Vespa Performance Comparison, Vision RAG, Binarizing vectors, and the Secret Store. Today, we’re excited to share the following updates:
- Python query API
- Vespa Logstash Connectors
- ModernBERT models
- Vespa CLI multi-get
And don’t forget to register for the Feb 13 AI Camp in San Francisco to learn how to Build a Visual RAG application!
Python query API
Vespa queries are expressed in YQL, like:
yql="select album,artist from music where artist contains 'coldplay'"
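As a minimal sketch of how such a YQL string reaches Vespa over HTTP, the snippet below URL-encodes it as the yql parameter of the /search/ endpoint. The host and port (http://localhost:8080) are assumptions for a local instance; substitute your own endpoint.

```python
from urllib.parse import urlencode

# Assumption: a Vespa container listening locally; replace with your own endpoint.
ENDPOINT = "http://localhost:8080/search/"

def build_query_url(yql: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that carries the YQL string as the `yql` query parameter."""
    return f"{endpoint}?{urlencode({'yql': yql})}"

yql = "select album,artist from music where artist contains 'coldplay'"
url = build_query_url(yql)
print(url)
```

Encoding the parameter keeps spaces, commas, and quotes in the YQL from breaking the URL.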
Queries can also be built or modified programmatically in a Searcher. A Java example:
QueryTree tree = query.getModel().getQueryTree();
OrItem orItem = new OrItem();
orItem.addItem(tree.getRoot());
orItem.addItem(new WordItem("metal", "album"));
tree.setRoot(orItem);
With the new Querybuilder released in Pyvespa 0.52, Python developers can now also generate YQL. This makes it easier to create valid Vespa queries in code:
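import pandas as pd
import vespa.querybuilder as qb
from vespa.querybuilder import QueryField

# Build the query with the Querybuilder instead of writing YQL by hand
q = (
    qb.select(["album", "artist"])
    .from_("music")
    .where(
        QueryField("artist").contains("coldplay")
    )
)

# app is an existing connection to the Vespa application
resp = app.query(yql=q)
# Collect the fields of each hit into a DataFrame
results = [hit["fields"] for hit in resp.hits]
df = pd.DataFrame(results)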
The Querybuilder supports Vespa Grouping, too:
from vespa.querybuilder import Grouping as G
grouping = G.all(
    G.group("customer"),
    G.each(G.output(G.sum("price"))),
)
q = qb.select("*").from_("purchase").where(True).set_limit(0).groupby(grouping)
resp = app.query(yql=q)
group = resp.hits[0]["children"][0]["children"]
# get value and sum(price) into a DataFrame
df = pd.DataFrame([hit["fields"] | hit for hit in group])
df = df.loc[:, ["value", "sum(price)"]]
Find more examples in Using-the-Querybuilder-DSL-API and the complete reference at vespa-querybuilder.
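To make the indexing in resp.hits[0]["children"][0]["children"] less opaque, here is a small sketch that walks a hand-written response fragment shaped like a Vespa grouping result. The customer names and sums are made-up sample data, and sample_hits and flatten_groups are hypothetical names; a live resp.hits would take the place of the literal.

```python
# Made-up sample shaped like resp.hits for the grouping query:
# group root -> group list for "customer" -> one group per customer.
sample_hits = [
    {
        "id": "group:root:0",
        "children": [
            {
                "id": "grouplist:customer",
                "label": "customer",
                "children": [
                    {"id": "group:string:Brown", "value": "Brown", "fields": {"sum(price)": 100}},
                    {"id": "group:string:Smith", "value": "Smith", "fields": {"sum(price)": 250}},
                ],
            }
        ],
    }
]

def flatten_groups(hits):
    """Return one row per group: the group value plus its aggregated fields."""
    groups = hits[0]["children"][0]["children"]
    return [{"value": g["value"], **g["fields"]} for g in groups]

rows = flatten_groups(sample_hits)
for row in rows:
    print(row)
```

Each row has the same "value" and "sum(price)" keys that the DataFrame example selects.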
Vespa Logstash Connectors
Getting data into and out of systems is often time-consuming and challenging. With Vespa’s Logstash plugins, this is much easier! See the blog post for how to import data from:
- CSV file
- Postgres database
- Apache Kafka
- Elasticsearch
- Self-hosted Vespa (into Vespa Cloud)
With this, migration projects should be more straightforward - let us know what other tools would help your migrations. We also recommend reading the documentation for cloning applications and data.
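For a sense of what such an import involves, the sketch below turns CSV rows into Vespa document put operations in the JSON feed format. It is not the Logstash plugin; it only illustrates the row-to-document mapping. The namespace mynamespace, the document type music, the column names, and the sample rows are all assumptions for illustration.

```python
import csv
import io
import json

# Made-up CSV content; in practice this would be read from a file.
CSV_TEXT = """id,artist,album
1,Coldplay,Parachutes
2,Metallica,Master of Puppets
"""

def csv_to_puts(text, namespace="mynamespace", doctype="music", id_column="id"):
    """Map each CSV row to a Vespa put operation (document ID plus fields)."""
    operations = []
    for row in csv.DictReader(io.StringIO(text)):
        doc_id = f"id:{namespace}:{doctype}::{row[id_column]}"
        fields = {k: v for k, v in row.items() if k != id_column}
        operations.append({"put": doc_id, "fields": fields})
    return operations

ops = csv_to_puts(CSV_TEXT)
for op in ops:
    print(json.dumps(op))  # one JSON object per line
```

The ID column becomes the last part of the document ID, and every other column becomes a field.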
ModernBERT models
Vespa 8.470 and later support ModernBERT models, such as the nomic-ai-modernbert model used in the example below.
On Vespa Cloud, deploy a model by configuring it as an embedder component in services.xml:
<component id="my-embedder-id" type="hugging-face-embedder">
    <transformer-model model-id="nomic-ai-modernbert"/>
    <transformer-output>token_embeddings</transformer-output>
    <max-tokens>8192</max-tokens>
    <prepend>
        <query>search_query:</query>
        <document>search_document:</document>
    </prepend>
</component>
Vespa CLI multi-get
With #33071, vespa document get accepts multiple document IDs in one command:
$ vespa document get id:mynamespace:doc::doc-1 id:mynamespace:doc::doc-2
Install Vespa CLI 8.471 to use it. Thanks to @wix-mikej for submitting this!
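The same lookups can be done over HTTP, one GET per document, through the /document/v1 API. The sketch below only derives the request path from each document ID; document_v1_path is a hypothetical helper, and IDs with n= or g= modifiers are deliberately left out.

```python
from urllib.parse import quote

def document_v1_path(doc_id: str) -> str:
    """Map an ID like id:mynamespace:doc::doc-1 to its /document/v1 path."""
    scheme, namespace, doctype, key_values, user_id = doc_id.split(":", 4)
    if scheme != "id":
        raise ValueError(f"not a Vespa document ID: {doc_id}")
    if key_values:
        # IDs with n=/g= modifiers use /number/ or /group/ paths; not covered here.
        raise ValueError("key/value modifiers are not handled in this sketch")
    return f"/document/v1/{namespace}/{doctype}/docid/{quote(user_id, safe='')}"

ids = ["id:mynamespace:doc::doc-1", "id:mynamespace:doc::doc-2"]
paths = [document_v1_path(i) for i in ids]
for p in paths:
    print(p)
```

Appending each path to the application's endpoint gives the URL for a single-document GET.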
New posts from our blog
- Transforming the Future of Information Retrieval with ColPali
- Shrinking Embeddings for Speed and Accuracy in AI Models
- Architecture Inversion: Scale By Moving Computation, Not Data
- Vespa with Logstash Recipes
Events
- AICamp, San Francisco, February 13, Andreas Eriksen: Building a Visual RAG application with Vespa in Python
- TreeHacks at Stanford, Feb 14-16
👉 Follow us on LinkedIn to stay in the loop on upcoming events, blog posts, and announcements.
Thanks for joining us in exploring the frontiers of AI with Vespa. Ready to take your projects to the next level? Deploy your application for free on Vespa Cloud today.