
Benchmarking Guide

This short guide explains how to use the tools in this library to run benchmarks.

As an example project we'll use the Sara demo.

First you'll need to install the project. An easy way to do this is via pip:

pip install git+https://github.com/RasaHQ/rasa-nlu-examples

You should now be able to run configuration files with NLU components from this library. You can glance over some examples below.

Here's a very basic configuration file.

language: en
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
  OOV_token: oov.txt
  analyzer: word
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 200

Assuming this file is named basic-config.yml, you can run this pipeline as a benchmark by running this command from the project directory:

rasa test nlu --config basic-config.yml \
          --cross-validation --runs 1 --folds 2 \
          --out gridresults/basic-config

This will generate output in the gridresults/basic-config folder.
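
The command above follows a simple pattern: the output folder is named after the config file. As a minimal sketch, the same command can be assembled in Python. The helper name `build_benchmark_command` and the `out_root` parameter are illustrative assumptions and not part of this library; the flags are the ones shown in the command above.

```python
import shlex
from pathlib import Path


def build_benchmark_command(config_path, runs=1, folds=2, out_root="gridresults"):
    """Return the `rasa test nlu` argument list for one config file.

    Hypothetical helper: it only mirrors the command shown in this guide.
    """
    config = Path(config_path)
    # The output folder is named after the config file without its extension.
    out_dir = Path(out_root) / config.stem
    return [
        "rasa", "test", "nlu",
        "--config", str(config),
        "--cross-validation",
        "--runs", str(runs),
        "--folds", str(folds),
        "--out", out_dir.as_posix(),
    ]


cmd = build_benchmark_command("basic-config.yml")
# Print the command as a single shell-style line for inspection.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the project directory would start the benchmark, assuming `rasa` is installed and on your path.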

Here's the same basic configuration, now with dense features added.

language: en
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
  OOV_token: oov.txt
  analyzer: word
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: rasa_nlu_examples.featurizers.dense.BytePairFeaturizer
  lang: en
  vs: 1000
  dim: 25
- name: DIETClassifier
  epochs: 200

Assuming this file is named basic-bytepair-config.yml, you can run it as a benchmark by running this command from the project directory:

rasa test nlu --config basic-bytepair-config.yml \
          --cross-validation --runs 1 --folds 2 \
          --out gridresults/basic-bytepair-config

This will generate output in the gridresults/basic-bytepair-config folder.

In the next configuration we've increased the Byte-Pair vocabulary size (vs) to 10000 and the dimensionality (dim) to 100.

language: en
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
  OOV_token: oov.txt
  analyzer: word
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: rasa_nlu_examples.featurizers.dense.BytePairFeaturizer
  lang: en
  vs: 10000
  dim: 100
- name: DIETClassifier
  epochs: 200

Assuming this file is named medium-bytepair-config.yml, you can run it as a benchmark by running this command from the project directory:

rasa test nlu --config medium-bytepair-config.yml \
          --cross-validation --runs 1 --folds 2 \
          --out gridresults/medium-bytepair-config

This will generate output in the gridresults/medium-bytepair-config folder.
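
The two Byte-Pair configurations shown so far differ only in their vs and dim settings. If you want to try more combinations, one option is to render the file from a template rather than copying it by hand. The sketch below is an illustration only: `render_bytepair_config`, `write_variants` and `VARIANTS` are hypothetical names, and the template simply repeats the pipeline from this guide.

```python
from pathlib import Path

# The pipeline from this guide, with vs and dim left as placeholders.
CONFIG_TEMPLATE = """\
language: en
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
  OOV_token: oov.txt
  analyzer: word
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: rasa_nlu_examples.featurizers.dense.BytePairFeaturizer
  lang: en
  vs: {vs}
  dim: {dim}
- name: DIETClassifier
  epochs: 200
"""

# File name -> (vs, dim), using the values shown in this guide.
VARIANTS = {
    "basic-bytepair-config.yml": (1000, 25),
    "medium-bytepair-config.yml": (10000, 100),
}


def render_bytepair_config(vs, dim):
    """Return the config text for one vocabulary size and dimensionality."""
    return CONFIG_TEMPLATE.format(vs=vs, dim=dim)


def write_variants(target_dir="."):
    """Write one config file per entry in VARIANTS and return the paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for filename, (vs, dim) in VARIANTS.items():
        path = target / filename
        path.write_text(render_bytepair_config(vs, dim))
        paths.append(path)
    return paths


# Show one rendered variant without touching the file system.
print(render_bytepair_config(10000, 100))
```

Calling `write_variants()` from the project directory would write both files there. Only some combinations of vs and dim are published as pretrained Byte-Pair embeddings, so check availability before adding new entries.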

Finally, this configuration uses the largest English Byte-Pair embeddings available (vs of 200000, dim of 300).

language: en
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
  OOV_token: oov.txt
  analyzer: word
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: rasa_nlu_examples.featurizers.dense.BytePairFeaturizer
  lang: en
  vs: 200000
  dim: 300
- name: DIETClassifier
  epochs: 200

Assuming this file is named large-bytepair-config.yml, you can run this benchmark by running this command from the project directory:

rasa test nlu --config large-bytepair-config.yml \
          --cross-validation --runs 1 --folds 2 \
          --out gridresults/large-bytepair-config

This will generate output in the gridresults/large-bytepair-config folder.
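
Once several benchmarks have run, you may want to compare them side by side. The sketch below rests on an assumption that this guide does not confirm: that each output folder contains an intent_report.json file with a "weighted avg" entry holding an "f1-score". Check the files that your Rasa version actually writes and adjust the names if they differ. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_weighted_f1(results_dir, report_name="intent_report.json"):
    """Return the weighted-average F1 score from one results folder.

    Assumes the report is a JSON object with a "weighted avg" entry.
    Returns None if the file or the entry is missing.
    """
    report_path = Path(results_dir) / report_name
    if not report_path.exists():
        return None
    report = json.loads(report_path.read_text())
    return report.get("weighted avg", {}).get("f1-score")


def summarize(root="gridresults"):
    """Map each subfolder of `root` to its weighted F1 score (or None)."""
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    return {
        run_dir.name: load_weighted_f1(run_dir)
        for run_dir in sorted(root_path.iterdir())
        if run_dir.is_dir()
    }


# Prints nothing if the gridresults folder does not exist yet.
for name, score in summarize().items():
    print(f"{name}: {score}")
```

With only 1 run and 2 folds per benchmark, small differences between configurations are likely to be noise, so treat a comparison like this as a rough indication.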

Final Reminder

Remember that these tools are experimental. We want this repository to be a place where people can share their NLU components and experiment, which also means that we don't claim these tools are state of the art. Always check whether they work for your pipeline. The components that we host here may also lag behind Rasa Open Source.