Elasticsearch job to match end messages with beginning messages, merge, then reindex

The system I'm running currently matches beginning messages with end messages, then sends them to Logstash for further processing.

This is done through a web of Python code. I was wondering if this could be accomplished with native Elasticsearch features instead.

Any suggestions?

It could be done with a multiline filter in Filebeat or Logstash to merge them.

There's a catch: the messages don't show up at the same time.

Would it be possible to hold the to-be-matched documents in Filebeat/Logstash until the complementary message arrives?

Keeping messages until another message of the same kind arrives has its limitations:

  • resources are needed to keep the messages:
    • keeping them in memory is not failsafe
    • what if the corresponding end message never arrives? TTL?
  • messages must be routed to the same worker in order to match, which might introduce hot-spot workers

You have to decide whether you are OK with such limitations.
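
To make those limitations concrete, here is a minimal sketch of the "hold until matched" approach in Python. It is not how Filebeat or Logstash work internally. The class name `PendingMatcher`, the correlation key, and the shape of the merged document are all hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class PendingMatcher:
    """Hold 'begin' messages in memory until the matching 'end' arrives or a TTL expires."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # correlation key -> (arrival time, begin message); lost if the process dies
        self.pending: Dict[str, Tuple[float, dict]] = {}

    def on_begin(self, key: str, message: dict) -> None:
        self.pending[key] = (self.clock(), message)

    def on_end(self, key: str, message: dict) -> Optional[dict]:
        """Return the merged document, or None if no begin message is waiting."""
        entry = self.pending.pop(key, None)
        if entry is None:
            return None
        _, begin = entry
        return {"id": key, "begin": begin, "end": message}

    def expire(self) -> List[dict]:
        """Drop and return begin messages that have waited longer than the TTL."""
        now = self.clock()
        expired = [k for k, (t, _) in self.pending.items() if now - t > self.ttl_seconds]
        return [self.pending.pop(k)[1] for k in expired]
```

Note that `on_end` only finds a match if both messages reached the same process, which is the routing constraint mentioned above.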

An alternative is to index the messages as they arrive and build sessions out of them in a second pass. Such a system can be realized with Transform. The transform can be set up as a continuous transform, so it would sessionize your data as it comes in.
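
As a rough sketch of what such a continuous transform could look like, the Python helper below only builds the request body. The index names, the `session_id` field, and the `@timestamp` field used in the usage note are assumptions about your data, not something taken from this thread. The grouping field is assumed to be a keyword field that both the begin and the end message carry.

```python
def build_session_transform(source_index: str, dest_index: str,
                            session_field: str, time_field: str) -> dict:
    """Build a continuous pivot transform body that groups messages per session."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # How often the transform checks the source index for changes
        "frequency": "1m",
        # "sync" is what makes the transform continuous
        "sync": {"time": {"field": time_field, "delay": "60s"}},
        "pivot": {
            # One output document per distinct session value
            "group_by": {session_field: {"terms": {"field": session_field}}},
            "aggregations": {
                "session_start": {"min": {"field": time_field}},
                "session_end": {"max": {"field": time_field}},
                # Difference between last and first message, in milliseconds
                "duration_ms": {
                    "bucket_script": {
                        "buckets_path": {"start": "session_start", "end": "session_end"},
                        "script": "params.end - params.start",
                    }
                },
            },
        },
    }
```

For example, `build_session_transform("messages-raw", "message-sessions", "session_id", "@timestamp")` returns a body you could send with `PUT _transform/<transform_id>` and then start with `POST _transform/<transform_id>/_start`. A session whose end message has not arrived yet would show a duration of 0 until the transform picks up the second message.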

This approach is more scalable and failsafe, but there is no free lunch: you need more resources, e.g. a larger source index. The source index could, however, be managed by ILM. If you only care about the sessions, you could use a rolling index that gets deleted, e.g., every week.
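
A weekly rollover-and-delete policy for the source index might look roughly like the sketch below. The helper name and the default of `7d` are illustrative assumptions; pick values that leave the transform enough time to process late end messages.

```python
def build_retention_policy(max_age: str = "7d") -> dict:
    """Build an ILM policy body that rolls the index over and later deletes it."""
    return {
        "policy": {
            "phases": {
                # Start a new backing index once the current one is max_age old
                "hot": {"actions": {"rollover": {"max_age": max_age}}},
                # min_age counts from rollover, so data can live up to about 2 * max_age
                "delete": {"min_age": max_age, "actions": {"delete": {}}},
            }
        }
    }
```

The resulting body could be sent with `PUT _ilm/policy/<policy_name>` and referenced from the index template of the source index.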

We have an example for calculating the duration between two events, which I think could be useful for you.

