How to aggregate consecutive "same" logs in Elasticsearch/Kibana?

Hi. I currently have data that looks like this.

// Input
{ user: "A", status: "away", timestamp: 1 }
{ user: "A", status: "away", timestamp: 2 }
{ user: "B", status: "active", timestamp: 3 }
{ user: "A", status: "away", timestamp: 4 }
{ user: "A", status: "active", timestamp: 5 }
{ user: "A", status: "active", timestamp: 6 }
{ user: "B", status: "active", timestamp: 7 }
{ user: "B", status: "away", timestamp: 8 }
{ user: "B", status: "away", timestamp: 9 }

I want to aggregate it into data like this, keeping only the first event of each run of identical statuses per user.

// Output
{ user: "A", status: "away", timestamp: 1 }
{ user: "B", status: "active", timestamp: 3 }
{ user: "A", status: "active", timestamp: 5 }
{ user: "B", status: "away", timestamp: 8 }

As you can see in the first data set, the same status event appears multiple times in a row. I'd like to visualize only the first event of each run instead of all of them. Perhaps I should use Logstash, but that isn't an option right now for certain reasons. Do you have any good ideas?

Welcome to our community! :smiley:

You should be able to use a top hits aggregation to get that - Top hits aggregation | Elasticsearch Guide [7.15] | Elastic
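
As a rough, untested sketch, the request body could look something like the following, written here as a Python dict. It assumes `user` and `status` are mapped as keyword fields and `timestamp` is sortable; the aggregation names `by_user`, `by_status`, and `first_event` are made up for illustration. Note that this groups by field value, so it returns the earliest event per user/status pair, not per consecutive run.

```python
import json


def build_first_event_query() -> dict:
    """Build a search body that returns the earliest hit per (user, status) pair.

    Assumed field names: user, status, timestamp (taken from the sample data).
    """
    return {
        "size": 0,  # skip the regular hits, only aggregations are needed
        "aggs": {
            "by_user": {
                "terms": {"field": "user"},
                "aggs": {
                    "by_status": {
                        "terms": {"field": "status"},
                        "aggs": {
                            "first_event": {
                                # one document per bucket, oldest first
                                "top_hits": {
                                    "size": 1,
                                    "sort": [{"timestamp": {"order": "asc"}}],
                                }
                            }
                        },
                    }
                },
            }
        },
    }


query = build_first_event_query()
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index holding the events.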

Thank you so much for the response. Could I ask for a few more hints?

  • Would you mind writing a rough JSON example of the query? I've read through the top_hits docs, and it looks close but not quite right, since it groups the data by field value. In contrast, my data can span days, months, or years and should be grouped by "consecutiveness".
  • Is it possible to show the output in a Kibana visualization? I've seen that "Metrics" in Kibana can hold the top_hits aggregation, but I'm still not sure exactly how to set it up.
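
To make "consecutiveness" concrete, here is a minimal client-side sketch in Python of the grouping I mean. It is not an Elasticsearch feature, just plain code over already-fetched events; the function name `first_of_each_run` is hypothetical, and it assumes every event has `user`, `status`, and `timestamp` keys as in my sample.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]


def first_of_each_run(events: Iterable[Event]) -> List[Event]:
    """Keep only the first event of each per-user run of identical statuses."""
    last_status: Dict[str, str] = {}  # most recent status seen for each user
    kept: List[Event] = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        user = event["user"]
        # a new run starts when the user is new or the status has changed
        if user not in last_status or last_status[user] != event["status"]:
            kept.append(event)
            last_status[user] = event["status"]
    return kept


events = [
    {"user": "A", "status": "away", "timestamp": 1},
    {"user": "A", "status": "away", "timestamp": 2},
    {"user": "B", "status": "active", "timestamp": 3},
    {"user": "A", "status": "away", "timestamp": 4},
    {"user": "A", "status": "active", "timestamp": 5},
    {"user": "A", "status": "active", "timestamp": 6},
    {"user": "B", "status": "active", "timestamp": 7},
    {"user": "B", "status": "away", "timestamp": 8},
    {"user": "B", "status": "away", "timestamp": 9},
]

for event in first_of_each_run(events):
    print(event)
```

Unlike grouping by field value, this would emit a new event each time a user returns to a status they had earlier, which is the behavior I'm after.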

Would you please correct me if I'm wrong? I need to learn more :slight_smile:

@warkolm I'd be glad if you could share any additional ideas :pray:

bump
