I'm trying to create automated tests for my Logstash pipeline config. I want to load multiple data sets from MySQL and for each data set, test that the raw data was transformed and filtered properly when loaded into Elasticsearch.
My plan is to use docker-compose to start containers for MySQL, Elasticsearch, Logstash, and my test runner. The biggest hurdle I have is triggering an import from the test runner, which is in its own container.
I can't run logstash -f mypipeline.conf from the test runner because Logstash is in another container, and Logstash doesn't appear to have an API for triggering a "run." I can start Logstash with automatic config reloading enabled, have the test runner put the pipeline config in a shared volume, and wait for Logstash to reload, but that seems clunky. The only way I would know the data set was loaded is by checking the pipeline stats API for the number of events sent to Elasticsearch and comparing it to the number of records in my raw data set, or putting the Logstash logs in a shared volume and checking for the "Pipeline has terminated" message. This would also be difficult to set up and tear down from the test runner, which I would want to do between every set of tests.
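To make the stats-polling idea concrete, here is a rough sketch of what I have in mind. It assumes the Logstash container is reachable as logstash on the default monitoring API port 9600 and that the pipeline uses the default id main; those names, and the helper functions themselves, are placeholders of mine. It also assumes the expected count is known up front, which gets harder if a filter drops events.

```python
import json
import time
from urllib.request import urlopen


def events_out(stats, pipeline_id="main"):
    """Pull the events.out counter for one pipeline out of a node stats payload."""
    return stats["pipelines"][pipeline_id]["events"]["out"]


def fetch_stats(base_url="http://logstash:9600", pipeline_id="main"):
    """Query the Logstash node stats API (host, port and pipeline id are assumptions)."""
    url = f"{base_url}/_node/stats/pipelines/{pipeline_id}"
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_for_events(expected, fetch=fetch_stats, timeout=60.0,
                    interval=1.0, sleep=time.sleep):
    """Poll until at least `expected` events have left the pipeline.

    Returns True on success and False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if events_out(fetch()) >= expected:
                return True
        except (OSError, KeyError, ValueError):
            # Logstash may still be starting, or the pipeline is not loaded yet.
            pass
        sleep(interval)
    return False
```

The test runner would call something like wait_for_events(number_of_rows) after loading a data set into MySQL, then query Elasticsearch for its assertions.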
Another possibility is to not put the test runner in a container at all, and instead to have it run docker-compose itself. Then it could start and stop containers to do set up and tear down between tests. After every test, the Logstash container could be restarted, which would automatically trigger an import. However, this still leaves the problem of knowing when an import is finished.
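For this second option, the runner side could look roughly like the sketch below. It assumes the compose file is docker-compose.yml in the working directory and that the Logstash service is named logstash; both names and the helper functions are my own placeholders, and it does not address detecting when the import has finished.

```python
import subprocess


def compose_cmd(*args, compose_file="docker-compose.yml"):
    """Build a docker-compose command line (the file name is an assumption)."""
    return ["docker-compose", "-f", compose_file, *args]


def compose(*args, run=subprocess.run, compose_file="docker-compose.yml"):
    """Run a docker-compose subcommand and raise if it exits non-zero."""
    return run(compose_cmd(*args, compose_file=compose_file), check=True)


def restart_logstash(run=subprocess.run):
    """Restart the Logstash service so its pipeline runs again."""
    # "logstash" is the assumed service name in the compose file.
    return compose("restart", "logstash", run=run)
```

Set up for the whole suite would be compose("up", "-d") and tear down compose("down", "-v"), with restart_logstash() called between tests.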
Is there a better way to do all this? logstash-test-runner is close to what I want, but it expects input from files, not a database.