Vespa Newsletter, January 2025

In the previous update, we mentioned Elasticsearch vs Vespa Performance Comparison, Vision RAG, Binarizing vectors, and the Secret Store. Today, we’re excited to share the following updates:

Python query API
Vespa Logstash Connectors
ModernBERT models
Vespa CLI multi-get

And don’t forget to register for the Feb 13 AI Camp in San Francisco to learn how to Build a Visual RAG application!

Python query API

Vespa queries are expressed in YQL, like:

yql="select album,artist from music where artist contains 'coldplay'"
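As a rough sketch of how such a YQL string travels to Vespa over HTTP, the snippet below URL-encodes it into a request for the query API at /search/. The endpoint http://localhost:8080 is an assumption for a local deployment, and build_query_url is a hypothetical helper name, not part of any Vespa library.

```python
from urllib.parse import urlencode


def build_query_url(yql: str, endpoint: str = "http://localhost:8080") -> str:
    """Return a GET URL for the Vespa query API carrying the given YQL.

    The default endpoint is an assumption (local deployment).
    """
    # urlencode escapes spaces, commas and quotes so the YQL stays one parameter.
    return f"{endpoint}/search/?{urlencode({'yql': yql})}"


yql = "select album,artist from music where artist contains 'coldplay'"
url = build_query_url(yql)
print(url)
```

Any HTTP client can then issue a GET against the printed URL.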

Queries can also be built or modified in a Searcher, as in this Java example:

QueryTree tree = query.getModel().getQueryTree();
OrItem orItem = new OrItem();
orItem.addItem(tree.getRoot());
orItem.addItem(new WordItem("metal", "album"));
tree.setRoot(orItem);

With the new Querybuilder released in Pyvespa 0.52, Python developers can now also generate YQL. This makes it easier to create valid Vespa queries in code:

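import pandas as pd
import vespa.querybuilder as qb
from vespa.querybuilder import QueryField

# Build a query equivalent to the YQL example above
q = (
    qb.select(["album", "artist"])
    .from_("music")
    .where(
        QueryField("artist").contains("coldplay")
    )
)
# app is an already-connected vespa.application.Vespa instance
resp = app.query(yql=q)
# Collect the fields of each hit into a DataFrame
results = [hit["fields"] for hit in resp.hits]
df = pd.DataFrame(results)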

The Querybuilder supports Vespa Grouping, too:

from vespa.querybuilder import Grouping as G

grouping = G.all(
    G.group("customer"),
    G.each(G.output(G.sum("price"))),
)
q = qb.select("*").from_("purchase").where(True).set_limit(0).groupby(grouping)
resp = app.query(yql=q)
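# Drill down: root group -> "customer" group list -> one entry per customer
group = resp.hits[0]["children"][0]["children"]
# Merge each group's fields with the group itself (dict | needs Python 3.9+),
# then keep only the group value and sum(price) columns
df = pd.DataFrame([hit["fields"] | hit for hit in group])
df = df.loc[:, ["value", "sum(price)"]]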
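To make the nested indexing into the grouping result easier to follow, here is a minimal sketch that runs without a Vespa instance or pandas. The hits list is hand-written sample data shaped like the structure the snippet above navigates, with made-up customers and sums, and flatten_groups is a hypothetical helper name.

```python
# Hand-written sample shaped like a grouping result; all values are made up.
hits = [
    {
        "id": "group:root:0",
        "children": [
            {
                "id": "grouplist:customer",
                "label": "customer",
                "children": [
                    {"id": "group:string:alice", "value": "alice",
                     "fields": {"sum(price)": 30}},
                    {"id": "group:string:bob", "value": "bob",
                     "fields": {"sum(price)": 12}},
                ],
            }
        ],
    }
]


def flatten_groups(hits, metric="sum(price)"):
    """Return (value, metric) pairs from the first group list under the root group."""
    # hits[0] is the root group, its first child is the group list,
    # and that list's children are the individual groups.
    groups = hits[0]["children"][0]["children"]
    return [(g["value"], g["fields"][metric]) for g in groups]


rows = flatten_groups(hits)
for value, total in rows:
    print(value, total)
```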

Find more examples in Using-the-Querybuilder-DSL-API and the complete reference at vespa-querybuilder.

Vespa Logstash Connectors

Getting data into and out of systems is often time-consuming and challenging. With Vespa’s Logstash plugins, this is much easier! See the blog post for how to import data from:

CSV file
Postgres database
Apache Kafka
Elasticsearch
Self-hosted Vespa (into Vespa Cloud)

This should make migration projects more straightforward. Let us know what other tools would help your migrations. We also recommend reading the documentation for cloning applications and data.

ModernBERT models

ModernBERT models are supported since Vespa 8.470.

Deploy a model on Vespa Cloud by declaring it as an embedder component in services.xml:

<component id="my-embedder-id" type="hugging-face-embedder">
    <transformer-model model-id="nomic-ai-modernbert"/>
    <transformer-output>token_embeddings</transformer-output>
    <max-tokens>8192</max-tokens>
    <prepend>
        <query>search_query:</query>
        <document>search_document:</document>
    </prepend>
</component>
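The prepend element configures an instruction prefix that is added to the text before it is embedded, with one prefix for queries and another for documents. The sketch below only illustrates that idea. It does not call Vespa, with_instruction is a hypothetical helper, and joining prefix and text with a single space is an assumption about the exact formatting.

```python
# Prefixes copied from the <prepend> element in the configuration above.
PREPEND = {"query": "search_query:", "document": "search_document:"}


def with_instruction(text: str, kind: str) -> str:
    """Return the text with its instruction prefix, as the embedder would see it.

    Assumption: prefix and text are joined by a single space.
    """
    return f"{PREPEND[kind]} {text}"


print(with_instruction("coldplay albums", "query"))
print(with_instruction("Parachutes is the debut album by Coldplay.", "document"))
```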

Vespa CLI multi-get

With #33071, vespa document get accepts multiple document IDs in one command:

$ vespa document get id:mynamespace:doc::doc-1 id:mynamespace:doc::doc-2
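For readers who script against the HTTP API instead of the CLI, the following sketch maps the same document IDs to their /document/v1 paths. document_v1_path is a hypothetical helper name, and the sketch deliberately handles only IDs with an empty key-value part (no n= or g= modifiers).

```python
from urllib.parse import quote


def document_v1_path(doc_id: str) -> str:
    """Map id:<namespace>:<type>::<user-specified> to a /document/v1 path."""
    # The user-specified part may itself contain colons, so split at most 4 times.
    parts = doc_id.split(":", 4)
    if len(parts) != 5 or parts[0] != "id":
        raise ValueError(f"not a Vespa document ID: {doc_id!r}")
    _, namespace, doc_type, key_values, user_specified = parts
    if key_values:
        raise ValueError("IDs with n= or g= modifiers are not handled in this sketch")
    # Percent-encode the user-specified part so it is safe as a single path segment.
    return f"/document/v1/{namespace}/{doc_type}/docid/{quote(user_specified, safe='')}"


ids = ["id:mynamespace:doc::doc-1", "id:mynamespace:doc::doc-2"]
paths = [document_v1_path(i) for i in ids]
for path in paths:
    print(path)
```

Each path can be fetched with a GET request against the application's endpoint.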

Install Vespa CLI 8.471 to use it. Thanks to @wix-mikej for submitting this!

New posts from our blog

Events

AICamp, San Francisco, February 13, Andreas Eriksen: Building a Visual RAG application with Vespa in Python
TreeHacks at Stanford, Feb 14-16

👉 Follow us on LinkedIn to stay in the loop on upcoming events, blog posts, and announcements.

Thanks for joining us in exploring the frontiers of AI with Vespa. Ready to take your projects to the next level? Deploy your application for free on Vespa Cloud today.
