Maintaining the correct sequence of log events for the Logstash Elapsed Plugin


(Franz Allan Valencia See) #1

Hi,

We have a microservice architecture and we want to measure how long an event gets processed from MicroserviceA to MicroserviceB, from MicroserviceA to MicroserviceC, from MicroserviceA to MicroserviceXYZ, etc

Each of our microservice logs the event with a timestamp.

This then gets picked up by the filebeat of that microservice's server (each microservice is in its own server, and each has its own filebeat instance), and filebeat publishes to our kafka, which our logstash subscribes to.

We keep track the event processing from MicroserviceA to MicroserviceXYZ via a uuid that we set.

Given a particular uuid, we expect that its timestamp for MicroserviceA is lower than its timestamp for MicroserviceB, which in turn is lower than its timestamp for MicroserviceC, so on and so forth.

The same goes for its kafka offset. Given a particular uuid, MicroserviceA's event should have a lower offset than MicroserviceB's event, which in turn should have a lower offset that MicroserviceC's event, etc.

Based on our ELK though, those two does not seem to be the case. Any ideas?

From the screenshot above, based from the timestamp, looks like the sequence of events are as follows:

  1. MicroserviceA (December 7th 2018, 04:22:51.614)
  2. MicroserviceB (December 7th 2018, 04:22:51.690)
  3. MicroserviceG (December 7th 2018, 04:22:51.765)
  4. MicroserviceF (December 7th 2018, 04:22:51.770)
  5. MicroserviceC (December 7th 2018, 04:22:51.777)
  6. MicroserviceE (December 7th 2018, 04:22:51.809)

Also, based on the kafka offset, the sequence are as follows

  1. MicroserviceF (1301123)
  2. MicroserviceG (1301124)
  3. MicroserviceC (1301129)
  4. MicroserviceE (1301138)
  5. MicroserviceB (1301156)
  6. MicroserviceA (1301157)

Because the sequence of events are out of order, we are getting elapsed_end_without_start in our logstash elapsed plugin.

Any ideas on how to fix the sequencing for this?

Thanks,
Franz


(Franz Allan Valencia See) #2

FYI: This is how bad it looks like because of this issue. Majority of the time, I get elapsed_end_without_start and only a few times do I get elpased_match.