Martin Polden
Principal Vespa Engineer

High performance feeding with Vespa CLI

Photo by Shiro hatori on Unsplash

For a long time, vespa-feed-client has been the best option for feeding large sets of documents to Vespa efficiently. While the client itself performs well, it depends on a Java runtime and is rather cumbersome to install. Compared to Vespa CLI, it also lacks many ease-of-use features, such as automatic configuration of authentication and endpoint discovery.

Since our initial announcement of Vespa CLI it has become the standard interface for working with Vespa applications, both for self-hosted installations and Vespa Cloud. However, document feeding with Vespa CLI was initially limited to single-document operations, using the vespa document command.

Having to juggle multiple tools while working with Vespa is not ideal. We therefore decided to implement a high-performance feed client inside Vespa CLI, making it a universal client for Vespa.

Today we’re excited to announce this new feed client! See it in action in the screencast below:

Performance

The new feed client is ready for most use cases. If you’re already using vespa-feed-client and want to switch to vespa feed, we recommend comparing the feed performance of your particular document set before making the switch. vespa feed outputs statistics in the same format as vespa-feed-client, making comparison easy.

We’ve invested a lot of time into making vespa feed as performant as the old client. In our performance tests, its current default configuration outperforms the old client when feeding small- (10B) and medium-sized (1KB) documents, but it still lags behind vespa-feed-client when feeding large (10KB+) documents.

Below you can see a throughput comparison (feed operations per second) of the two clients when feeding two million documents at sizes 10B, 1KB and 10KB:

We’ll continue making performance improvements to the new client, so make sure to keep your Vespa CLI installation up-to-date.
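If you want to run a rough comparison of your own at the document sizes mentioned above, a synthetic feed file is one way to start. The following is a minimal sketch, not the benchmark setup used for this post: it writes one JSON put operation per line, and the namespace `test`, document type `doc` and field name `text` are placeholders that must be changed to match your own schema. The size argument controls only the length of the text field, not the size of the whole JSON operation.

```python
import json


def make_doc(doc_id: int, payload_bytes: int) -> dict:
    """Build one put operation in Vespa's JSON document format.

    The namespace ("test"), document type ("doc") and field name ("text")
    are hypothetical; replace them with names from your own schema.
    """
    return {
        "put": f"id:test:doc::{doc_id}",
        "fields": {"text": "x" * payload_bytes},
    }


def write_feed(path: str, count: int, payload_bytes: int) -> None:
    """Write `count` put operations to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for i in range(count):
            f.write(json.dumps(make_doc(i, payload_bytes)) + "\n")
```

For example, `write_feed("docs-1kb.jsonl", 1000, 1024)` produces a file that can then be passed to `vespa feed docs-1kb.jsonl`, assuming the target application has a matching document type deployed. Real documents usually compress and index differently from repeated filler text, so your own document set remains the better test.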

Future of the Java client

The introduction of vespa feed does not deprecate vespa-feed-client. If you’re already using vespa-feed-client there is no immediate need to migrate to the new client. vespa-feed-client provides both a Java library and a command-line interface for that library, both of which will remain supported.

However, if you’d rather use Vespa CLI for all things Vespa and don’t depend on vespa-feed-client as a Java library, we encourage you to try our new client.

Getting started

The new feed client is available in Vespa CLI as of version 8.164. See vespa help feed for usage and the Vespa documentation for further details.
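If you drive Vespa CLI from a script, a small wrapper can check the version requirement before feeding. This is a sketch under a few assumptions: `supports_feed` and `feed` are hypothetical helper names, the version is supplied by the caller as a plain dotted string such as `8.164.2`, and the `vespa` binary is expected to be on `PATH`. Only the basic invocation `vespa feed <file>` is used; see `vespa help feed` for the available options.

```python
import shutil
import subprocess

# First Vespa CLI release that includes `vespa feed`, according to this post.
MIN_FEED_VERSION = (8, 164)


def supports_feed(version: str) -> bool:
    """Return True if a dotted version string such as '8.164.2' is >= 8.164."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= MIN_FEED_VERSION


def feed(path: str) -> int:
    """Run `vespa feed <path>` and return the exit code of the CLI."""
    if shutil.which("vespa") is None:
        raise FileNotFoundError("vespa CLI not found on PATH")
    # No extra flags are passed; the CLI's default configuration is used.
    return subprocess.run(["vespa", "feed", path]).returncode
```

Comparing versions as integer tuples avoids the pitfall of string comparison, where `8.99` would incorrectly sort after `8.164`.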

If you’re using Homebrew, you can upgrade to the latest version with brew upgrade vespa-cli. Otherwise, you can download the latest release from our GitHub releases page.

New to Vespa CLI? Please see our quick start guides for self-hosted Vespa or Vespa Cloud.

Found a bug or have a feature request? Feel free to file a GitHub issue. Need help with Vespa CLI or Vespa in general? Drop by our community Slack channel.