Radu Gheorghe
Software Engineer

09 Apr 2025

Quick Start with Logstash: from data to Vespa schema

If you want to get started with Vespa, check out our getting started guides. They are based on the sample apps, which provide good inspiration for your own use-cases.

But what if you already have some data that you want to write to Vespa?

This is where Logstash comes in. Its Output plugin for Vespa now has a detect_schema mode that can generate a Vespa application package from your data. The application package contains all the configuration required for Vespa to run: from the number of nodes and machine learning models to the schema.

In this tutorial, we’ll go through the fastest way to get your data into Vespa, whether you’re running Vespa locally (e.g., with Docker or Podman) or using Vespa Cloud. Either way, the high-level steps are the same:

1. Download Logstash.
2. Install the Vespa Output plugin.
3. Configure Logstash to use the detect_schema mode.
4. Upload the generated application package to Vespa.
5. Disable detect_schema and re-run Logstash to write your data.

Let’s get into the specifics.

Logstash to local Vespa

The easiest way to get started is to download a zip/tgz archive from the Logstash download page. You can also install Logstash using your package manager or run it as a container.

Once it’s unpacked, install the Vespa Output plugin by running:

bin/logstash-plugin install logstash-output-vespa_feed

The config file will depend on your data. Have a look at this 5-recipe blog post for some inspiration. For now, let’s just read JSON documents from standard input, as an example.

# read JSON documents from standard input
input {
    stdin {
        codec => json
    }
}

# remove fields that are not part of our JSON documents
filter {
    mutate {
        remove_field => ["@timestamp", "@version", "event", "host", "log", "message"]
    }
}

output {
    # uncomment to print to stdout, for debugging
    # stdout {
    #     codec => rubydebug
    # }

    vespa_feed {
        # this will generate a Vespa application package, instead of feeding documents
        detect_schema => true
        # make Logstash deploy the application package to Vespa as well
        deploy_package => true
    }
}

Now, let’s assume Vespa is running locally, started with something like:

podman run --detach --name vespa-container --hostname vespa-container \
  --publish 8080:8080 --publish 19071:19071 \
  vespaengine/vespa
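
Deploying the application package needs the config server inside the container to be up, which can take a little while after the container starts. As a minimal sketch, the hypothetical helper below polls a URL until it responds. It assumes curl is installed and that the config server's health endpoint is reachable at /state/v1/health on the published port 19071.

```shell
# Hypothetical helper: poll a URL until it answers with a successful HTTP
# status, or give up after roughly TIMEOUT_SECONDS (default: 300).
# Usage: wait_for_url URL [TIMEOUT_SECONDS]
wait_for_url() {
  url="$1"
  timeout="${2:-300}"
  elapsed=0
  # --fail makes curl return non-zero on HTTP errors, --max-time caps each attempt
  until curl --silent --fail --max-time 5 --output /dev/null "$url"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

With this in place, running wait_for_url http://localhost:19071/state/v1/health 300 before starting Logstash should avoid a deploy attempt against a container that is still booting.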

You can run Logstash and send a sample document to it:

echo '{"id": "1", "title": "Hello, world!"}' | bin/logstash -f config.conf

This will generate a Vespa application package and deploy it to your local container. At this point, you can disable detect_schema and re-run Logstash in exactly the same way to write your data to Vespa.

echo '{"id": "1", "title": "Hello, world!"}' | bin/logstash -f config.conf
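
Disabling detect_schema is a one-line change in the config file, which you can make by hand. If you'd rather script it, here is a minimal sketch: disable_detect_schema is a hypothetical helper name, and it assumes the option is written as detect_schema => true, as in the config above.

```shell
# Hypothetical helper: flip detect_schema from true to false in a Logstash config.
# Usage: disable_detect_schema CONFIG_FILE
disable_detect_schema() {
  config_file="$1"
  # -i.bak edits in place and keeps a backup copy (works with GNU and BSD sed)
  sed -i.bak \
    's/detect_schema[[:space:]]*=>[[:space:]]*true/detect_schema => false/' \
    "$config_file"
}
```

Calling disable_detect_schema config.conf leaves the original file in config.conf.bak, so you can switch back to schema detection later if your data changes.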

Now you’re ready to profit (i.e., query):

curl -XPOST -H "Content-Type: application/json" -d \
  '{"yql": "select * from sources * where true"}' \
  'http://localhost:8080/search/' | jq .

Once you’ve satisfied the initial thirst, you can go back to the deployed application package and iterate on it. The schema documentation and our IDE plugins should help you along the way.

To deploy a new iteration of the application package, you’ll need the Vespa CLI. With it, you can do:

# --wait 900 makes the CLI wait up to 900 seconds for the deployment to become ready. Without it, you'll have to look in the logs.
vespa deploy --wait 900

Logstash prints the path to the generated application package when it deploys it. If you lost that output, the Vespa CLI can download the deployed application package for you:

vespa fetch /download/path

Speaking of the Vespa CLI, you’ll need it for Vespa Cloud as well.

Logstash to Vespa Cloud

With a Vespa Cloud account created, you’ll need to create a tenant and an application. Then, in your Logstash config, under the output section, add those details:

# the `input` and `filter` sections are the same as for local Vespa
output {
    vespa_feed {
        # Vespa Cloud details
        vespa_cloud_tenant => "your-tenant"
        vespa_cloud_application => "your-application"

        ### same options as for local Vespa

        # this will generate a Vespa application package, instead of feeding documents
        detect_schema => true
        # make Logstash deploy the application package to Vespa as well
        deploy_package => true
    }
}

When you run Logstash (with the same bin/logstash -f config.conf command as before), there are two differences. First, Logstash will, by default, generate mTLS certificates and copy them to .vespa under your home directory. You can also do this manually by running vespa auth cert.

Second, the application package won’t be deployed automatically. Instead, you’ll see four commands to copy-paste:

1. Point Vespa CLI to Vespa Cloud: vespa config set target cloud
2. Point it to your tenant and application: vespa config set application YOUR_TENANT.YOUR_APPLICATION.default. Here, “default” is the default instance name, which you can change when you create the application. If you do, adjust vespa_cloud_instance in the Logstash config to match.
3. Authenticate your Vespa CLI to your Vespa Cloud account: vespa auth login
4. Deploy the application package: vespa deploy --wait 900
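
The four commands above can also be collected into a small script. This is only a sketch: vespa_app_id and deploy_to_vespa_cloud are hypothetical helper names, the package directory is whatever path Logstash printed, and vespa auth login is interactive, so the wrapper is meant to be run from a terminal.

```shell
# Build the TENANT.APPLICATION.INSTANCE string passed to the Vespa CLI.
# The instance falls back to "default" when it is omitted or empty.
vespa_app_id() {
  printf '%s.%s.%s\n' "$1" "$2" "${3:-default}"
}

# Hypothetical wrapper around the four commands; stops at the first failure.
# Usage: deploy_to_vespa_cloud PACKAGE_DIR TENANT APPLICATION [INSTANCE]
deploy_to_vespa_cloud() {
  package_dir="$1"
  vespa config set target cloud &&
    vespa config set application "$(vespa_app_id "$2" "$3" "$4")" &&
    vespa auth login &&
    # deploy from inside the generated application package directory
    (cd "$package_dir" && vespa deploy --wait 900)
}
```

For example, deploy_to_vespa_cloud /path/printed/by/logstash your-tenant your-application would target your-tenant.your-application.default.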

Once you’ve deployed the application package, you can disable detect_schema and re-run Logstash in exactly the same way as for local Vespa. Logstash will automatically set the mTLS certificates to those of the generated application package. If you need to change them, modify the client_cert and client_key options in the Vespa output of your Logstash config. Check out the full list of options in the Logstash Output plugin for Vespa README.

Happy hacking! Oh, and feel free to reach out on LinkedIn, X or Slack if you have any questions!
