Tag metrics as testing

I use Metricbeat with Logstash to process metrics and send them to my Elasticsearch cluster. I run performance tests on some equipment, and I have to write down the start and end time of each test so I can go to Kibana and look up that time frame. I was thinking it would be better if I could add some sort of trigger or tag while my test is active, so that when I need to see the results I only have to search for that specific tag instead of looking up the machine, the time frame, and the metrics.

I tried multiple things, like using the http input in Logstash to send a start signal to an aggregate filter and begin tagging all metrics coming from Metricbeat and APM, then sending another curl with a stop action so the aggregate filter stops tagging. But it proved difficult, because I can't tell when an aggregate map is active unless I somehow know the task_id, and I only know that at the start and end of the test. I also thought about metadata, but that only persists within a single event.
Any idea how to tackle it?
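For reference, the start/stop signals described above could be sent to the Logstash http input roughly like this (hostname, port, and field names are illustrative; the port matches the http input config shown later in the thread):

```
# Begin tagging: announce that test TEST-42 is running on machine-01
curl -X PUT "http://logstash-host:5045" \
  -H 'Content-Type: application/json' \
  -d '{"action": "Start", "machineid": "machine-01", "testid": "TEST-42"}'

# End tagging for machine-01
curl -X PUT "http://logstash-host:5045" \
  -H 'Content-Type: application/json' \
  -d '{"action": "Stop", "machineid": "machine-01"}'
```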

So there is a name/ID for each test that you would want to add as a tag to all events that arrive between the start and stop signal? Am I getting this right? Could you give a bit more detail about the data you have and the result you'd like to achieve? E.g. are there metrics from multiple machines, and would the HTTP signal say which ones are affected by the test? Could the tests ever overlap? And is the delay between the test start and the arrival of the signal in Logstash negligible?

Yes, there is a name/ID for each test, and there are multiple machines. Because I'm measuring performance, each machine runs a single test exclusively to avoid any interference. The metrics come from Metricbeat (network usage, CPU, RAM, disk I/O) and Filebeat.
The delay between the start of the test and the arrival of the signal in Logstash can indeed be considered negligible.

These are all automated test scripts that run on any available free machine. Each test is about 30-60 minutes long, and the machine is released about 15 minutes after each test ends so another test can be performed.

At the end of the day, I have to manually write down which test was done (the ID) and when it was done (the time frame), then go to Kibana, retrieve the results, and compare them. It would save a lot of time if the metrics were simply tagged with the TestID.

Then you could use aggregate like you had planned, using the machine name as your task_id. Create the map on arrival of a start signal and save the test ID in it. Take the test ID from the map when a metrics event comes in, and end the map on arrival of the stop signal. Do you see any obstacle to this?
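A minimal sketch of that filter chain, assuming the PUT body carries `machineid` and `testid` fields, and that the metrics events expose the same `machineid` (in practice Metricbeat reports the host under a field like `[host][name]`, so you may need a mutate filter to align the two):

```
filter {
  # Start signal: create the map for this machine and remember the test ID
  if [action] == "Start" {
    aggregate {
      task_id    => "%{machineid}"
      map_action => "create"
      code       => "map['testid'] = event.get('testid')"
    }
  }

  # Metrics event: with map_action 'update' the code only runs
  # if a map already exists for this machine, i.e. a test is active
  if "metricbeat" in [@metadata][beat] {
    aggregate {
      task_id    => "%{machineid}"
      map_action => "update"
      code       => "event.set('testid', map['testid']); event.tag('testing')"
    }
  }

  # Stop signal: end the task and discard the map
  if [action] == "Stop" {
    aggregate {
      task_id     => "%{machineid}"
      map_action  => "update"
      code        => ""
      end_of_task => true
    }
  }
}
```

The `map_action => "update"` on the metrics branch is the "is a test active?" check: when no map exists for that task_id, the code block is simply skipped and the event passes through untagged. A `timeout` option on the aggregate filters would also guard against maps leaking if a stop signal is ever lost.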

Edit: an alternative would be to log the test ID, start, end, and machine somewhere and update the documents by query after the end of the test. Then you wouldn't have to run your metrics ingest with only one worker thread, as you would with the aggregate filter.
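The update-by-query variant could look roughly like this, run once per logged test window (index pattern, field names, machine name, and timestamps are all illustrative):

```
# Tag all metrics from machine-01 within the logged test window
curl -X POST "elasticnode01:9200/metricbeat-*/_update_by_query" \
  -H 'Content-Type: application/json' -d'
{
  "script": {
    "source": "ctx._source.testid = params.testid",
    "params": { "testid": "TEST-42" }
  },
  "query": {
    "bool": {
      "filter": [
        { "term":  { "host.name": "machine-01" } },
        { "range": { "@timestamp": {
            "gte": "2023-05-01T10:00:00Z",
            "lte": "2023-05-01T10:45:00Z" } } }
      ]
    }
  }
}'
```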

That is indeed what I tried to do, taking the machine name as the task_id, but the same question arises: how would I know whether a task is active?

  input {
    http  { port => "5045" }
    beats { port => "5044" }
  }
  filter {
    if [headers][request_method] == "PUT" {
      if [action] == "Start" {
        aggregate {
          task_id    => "%{machineid}"
          map_action => "create"
          code       => "map['testid'] = event.get('testid')"
        }
      }
      if [action] == "Stop" {
        aggregate {
          task_id     => "%{machineid}"
          map_action  => "update"
          code        => ""
          end_of_task => true
        }
      }
    }
  }
  output {
    if "metricbeat" in [@metadata][beat] {
      #How to check if the task is enabled??
      #if enabled tag it as testing true otherwise false.
      #or add_tag... etc
      elasticsearch {
        hosts => ["elasticnode01:9200"]
        index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      }
    }
  }
I like the idea of logging it separately. Maybe I can use an output to write the start and end dates to a temporary file, and then use an exec output to run a script that updates the documents by query.
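One way that plan could be wired up, as a sketch (the exec output plugin, logstash-output-exec, may need to be installed separately; the file path and the script name are illustrative placeholders):

```
output {
  # Append every start/stop signal to a log of test windows
  if [action] in ["Start", "Stop"] {
    file {
      path  => "/var/log/test-windows.log"
      codec => line { format => "%{testid},%{machineid},%{action},%{@timestamp}" }
    }
  }
  # On Stop, run a script that tags the finished window via _update_by_query
  if [action] == "Stop" {
    exec {
      command => "/usr/local/bin/tag-test-window.sh %{testid} %{machineid}"
    }
  }
}
```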

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.