How can Logstash runs be triggered from automated tests in a separate Docker container?

I'm trying to create automated tests for my Logstash pipeline config. I want to load multiple data sets from MySQL and for each data set, test that the raw data was transformed and filtered properly when loaded into Elasticsearch.
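To make the goal concrete, each test would basically trigger an import and then make assertions against Elasticsearch. Something like this rough Python sketch (the index name, field names, and the requests-based helper are placeholders for illustration, not my actual tests):

```python
import requests

ES_URL = "http://elasticsearch:9200"  # service name from docker-compose

def count_docs(index, query):
    # Count documents in Elasticsearch matching a query.
    resp = requests.get(f"{ES_URL}/{index}/_count", json={"query": query})
    resp.raise_for_status()
    return resp.json()["count"]

def test_rejected_rows_are_filtered_out():
    # Rows flagged as rejected in MySQL should never make it into the index.
    assert count_docs("transformed-data", {"term": {"status": "rejected"}}) == 0
```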

My plan is to use docker-compose to start containers for MySQL, Elasticsearch, Logstash, and my test runner. The biggest hurdle I have is triggering an import from the test runner, which is in its own container.

I can't run logstash -f mypipeline.conf from the test runner because Logstash is in another container, and Logstash doesn't appear to have an API for triggering a "run." I could start Logstash with automatic config reloading enabled, have the test runner drop the pipeline config into a shared volume, and wait for Logstash to pick it up, but that seems clunky. The only way I'd know a data set had finished loading would be to check the pipeline stats API for the number of events sent to Elasticsearch and compare it to the number of records in my raw data set, or to put the Logstash logs in a shared volume and watch for the "Pipeline has terminated" message. All of this would also be awkward to set up and tear down from the test runner, which I'd want to do between every set of tests.
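For reference, the stats check would look roughly like this (a sketch assuming the default monitoring API on port 9600 and a pipeline called main; the expected count would come from a COUNT(*) against the raw MySQL table):

```python
import time
import requests

LOGSTASH_API = "http://logstash:9600"  # default Logstash monitoring API port

def wait_for_events(pipeline, expected, timeout=120):
    # Poll the node stats API until the pipeline has emitted the expected
    # number of events, or give up after the timeout.
    deadline = time.time() + timeout
    while time.time() < deadline:
        stats = requests.get(f"{LOGSTASH_API}/_node/stats/pipelines/{pipeline}").json()
        out = stats["pipelines"][pipeline]["events"]["out"]
        if out >= expected:
            return out
        time.sleep(1)
    raise TimeoutError(f"pipeline '{pipeline}' only emitted {out} of {expected} events")
```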

Another possibility is to not put the test runner in a container at all, and instead have it run docker-compose itself. It could then start and stop containers to handle setup and teardown between tests. After every test, the Logstash container could be restarted, which would automatically trigger an import. However, this still leaves the problem of knowing when an import is finished.
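The setup/teardown part of that approach would be easy enough to script from the host. A minimal sketch (service names, credentials, and the database name are placeholders from a hypothetical compose file):

```python
import subprocess

def reset_and_reimport(seed_sql):
    # Re-seed MySQL with the next data set, then restart Logstash so the
    # pipeline runs again against the fresh data.
    with open(seed_sql) as f:
        subprocess.run(
            ["docker-compose", "exec", "-T", "mysql",
             "mysql", "-uroot", "-proot", "testdb"],
            stdin=f, check=True,
        )
    subprocess.run(["docker-compose", "restart", "logstash"], check=True)
    # There is still no clean signal here for when the import has finished.
```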

Is there a better way to do all this? logstash-test-runner is close to what I want, but it expects input from files, not a database.

In the end, I took a slightly different approach than the two I mentioned in my question. I added a websocket server to my Logstash container. Every time my tests want to load a new data set, they send a message to the websocket server, which spawns a child process to run logstash -f mypipeline.conf. When the process exits, the websocket server sends a message back saying that the run is finished. You could also create a standard REST API instead of using websockets, but I didn't want to mess with polling.
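The server side is tiny. This isn't my exact code, but a simplified Python sketch of the idea (assuming the third-party websockets package and an arbitrary port; the message is just the path to the pipeline config):

```python
import asyncio
import websockets  # third-party package; recent versions accept single-argument handlers

async def handle(ws):
    async for config_path in ws:
        # Run Logstash with the requested pipeline config and wait for it to exit.
        proc = await asyncio.create_subprocess_exec("logstash", "-f", config_path)
        await proc.wait()
        await ws.send("done" if proc.returncode == 0 else "failed")

async def main():
    async with websockets.serve(handle, "0.0.0.0", 8765):
        await asyncio.Future()  # serve forever

if __name__ == "__main__":
    asyncio.run(main())
```

On the test-runner side, triggering a run is just connect, send, and wait for the reply:

```python
import asyncio
import websockets

async def run_pipeline(config_path):
    # Ask the Logstash container to run the pipeline and block until it finishes.
    async with websockets.connect("ws://logstash:8765") as ws:
        await ws.send(config_path)
        return await ws.recv()  # "done" or "failed"
```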

There are a couple of problems with this approach. First, not all pipeline configurations cause Logstash to exit when the pipeline terminates. It works with the JDBC input plugin, which I happen to be using (without a schedule, it runs the query once and the pipeline finishes), but not with the file input plugin, for example. Second, starting a new Logstash process for every run is slow. It would be much better to reuse a persistent Logstash process.
