I know this has been discussed before, but I just wanted to make sure I have the correct, up-to-date info and approach on this:
Fairly new to Elasticsearch, moving up the learning curve
We have: Filebeat -> Kafka -> Logstash -> Elasticsearch/Kibana
I have a scenario where we want to measure the elapsed time between two asynchronous transactions, i.e. a BEGIN and an END event arriving asynchronously in the logs, with a unique id matching each BEGIN to its END. The Elapsed Time Filter in Logstash initially worked perfectly for this and produced an elapsed_time field for each matching pair. HOWEVER, I can't continue with this approach because it doesn't scale: we will be ramping up the ingestion rate to the point where multiple Logstash nodes are required, and my understanding is that the Elapsed Time Filter won't work in that case, because the matching BEGIN and END events may be routed to different Logstash nodes.
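For reference, the working single-node setup used the elapsed filter roughly like this (the tag and field names below are placeholders for whatever the earlier grok/mutate stages actually set, not my exact config):

```
filter {
  elapsed {
    # events tagged "txn_begin" start the clock, "txn_end" stop it
    start_tag       => "txn_begin"
    end_tag         => "txn_end"
    # field shared by the BEGIN and END events of one transaction
    unique_id_field => "transaction_id"
    # stop waiting for a matching END event after 10 minutes
    timeout         => 600
  }
}
```

The problem is that this state (the pending BEGIN events) lives in the memory of a single Logstash instance, which is exactly what breaks once events for the same transaction can land on different nodes.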
That being the case, I believe I need to perform the calculation at the Elasticsearch stage. The question is: what is the best approach here? Note that the matching BEGIN and END pairs are separate documents/events within the Elasticsearch index, and there is a unique id that matches each BEGIN transaction with its END transaction.
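To make it concrete, one option I've been looking at is computing the elapsed time at query time with an aggregation, something like the sketch below (assuming the unique id is indexed as a `transaction_id` keyword field, both events carry `@timestamp`, and the index is called `my-logs` — all names are illustrative):

```
GET my-logs/_search
{
  "size": 0,
  "aggs": {
    "per_txn": {
      "terms": { "field": "transaction_id", "size": 1000 },
      "aggs": {
        "begin_ts": { "min": { "field": "@timestamp" } },
        "end_ts":   { "max": { "field": "@timestamp" } },
        "elapsed_ms": {
          "bucket_script": {
            "buckets_path": { "begin": "begin_ts", "end": "end_ts" },
            "script": "params.end - params.begin"
          }
        }
      }
    }
  }
}
```

But I'm not sure whether this kind of query-time calculation is the right answer at high cardinality, or whether something like a transform or an ingest-time approach that writes a persistent elapsed_time field per pair would be better.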
I would dearly like to use the Logstash Elapsed Time Filter, but can't see how I can with multiple Logstash nodes.
Any help or comments greatly appreciated.