Vespa Product Updates, February 2020
In the January product update, we mentioned Tensor Operations, New Sizing Guides, matched-elements-only performance and Boolean query optimizations.
This month, we’re excited to share the following updates:
Ranking with LightGBM Models
Vespa now supports LightGBM machine learning models in addition to ONNX, TensorFlow and XGBoost. LightGBM is a gradient boosting framework that trains fast, has a small memory footprint and provides accuracy similar to or better than XGBoost. LightGBM also supports categorical features.
Matrix multiplication performance
Vespa now uses OpenBLAS for matrix multiplication, which improves performance for machine-learned models that rely on it.
Benchmarking guide
Teams use Vespa to implement applications with strict latency requirements at the lowest possible cost. In January we released a new sizing guide. This month, we’re adding a benchmarking guide that you can use to find the sweet spot between cost and performance.
Query builder
Thanks to contributions from yehzu, Vespa now has a fluent library for composing queries. See the client module for details.
Hadoop integration
Vespa is integrated with Hadoop, which makes it easy to feed from a grid. The grid integration now also supports conditional writes; see #12081.
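To give a feel for what a conditional write looks like, here is a minimal Python sketch that builds a single update operation guarded by a condition. It assumes the Vespa JSON document format, where an operation carries a `condition` field with a document selection expression. The helper name `conditional_update`, the document id, and the `music.year` field are made up for illustration, and the sketch does not show how the Hadoop feeder itself is configured; #12081 is the place to look for that.

```python
import json


def conditional_update(doc_id, condition, assignments):
    """Build one update operation that should only apply if `condition` holds.

    Hypothetical helper: the dict layout follows the Vespa JSON document
    format as we understand it ("update", "condition", "fields").
    """
    return {
        "update": doc_id,
        "condition": condition,
        # Each field is updated with an "assign" operation.
        "fields": {name: {"assign": value} for name, value in assignments.items()},
    }


# Illustrative values only: namespace, document type and field are assumptions.
op = conditional_update(
    "id:mynamespace:music::a-head-full-of-dreams",
    "music.year < 2016",
    {"year": 2015},
)

# Serialize the operation so it can be written to a feed file.
print(json.dumps(op, indent=2))
```

With a condition attached, an update like this one would be expected to take effect only for a document whose current `year` is below 2016, which is useful when several jobs on the grid may write to the same document.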
About Vespa: Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.
We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.