Filter duplicate log records in a Kibana dashboard

Hi Friends,

I want to remove the duplicate log records that are displayed in my Kibana dashboard.

For example:
{TransactionID: 23750 , Status: failed, Time: 14/06/2020 11:20:25}
{TransactionID: 23750 , Status: Passed, Time: 14/06/2020 11:25:12}

In the above example, both records have the same transaction ID, but the latest status is Passed. I want to show only the most recently processed record. Kindly help me with this. Thanks.

The solution depends upon how your data is structured and what question you wish to answer.

You could just filter based on status, or perhaps you could aggregate on transaction ID. It might also make sense to have a single document instead of two.

Hi @mattkime

Thanks for your reply. My log file may contain any number of records, and there is a chance that it contains many duplicate entries. I want to display the total number of records I received for a day, without duplicates. The unique column is the transaction ID. If there are duplicates, I need to pick only the most recent record based on the timestamp. Kindly guide me. Thanks in advance.

Try using a top hits aggregation on the timestamp with a terms agg on the transaction ID. That should get it.
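
As a rough sketch of what that suggestion could look like as an Elasticsearch query body, here it is built as a Python dict. The field names `TransactionID` and `Time` are taken from the example records and are assumptions about the actual index mapping (the ID field may need to be a keyword or numeric field, and `Time` a date field). The helper name and the `max_ids` bucket limit are hypothetical.

```python
import json


def latest_per_transaction_query(id_field="TransactionID",
                                 time_field="Time",
                                 max_ids=1000):
    """Build a search body: one bucket per transaction ID, newest hit only.

    Field names and max_ids are assumptions, not taken from a real mapping.
    """
    return {
        "size": 0,  # skip the raw hits, return only the aggregation
        "aggs": {
            "by_transaction": {
                # terms agg: one bucket per distinct transaction ID
                "terms": {"field": id_field, "size": max_ids},
                "aggs": {
                    "latest": {
                        # top_hits agg: keep the single newest document
                        "top_hits": {
                            "size": 1,
                            "sort": [{time_field: {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_transaction_query()
print(json.dumps(body, indent=2))
```

The printed JSON could be pasted into Kibana Dev Tools under a `GET <your-index>/_search` request. Note that the terms aggregation only returns up to `size` buckets, so `max_ids` would need to cover the number of transactions per day.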

Hi @warkolm

I need to display the total transaction count in the dashboard. If I use a top hits aggregation on the timestamp with a terms aggregation on the transaction ID, it shows the last transaction received, not the total count.

Does that mean transaction IDs are not unique?

@warkolm The transaction ID is repeated in the duplicate entries.

For example:

{TransactionID: 23751 , Status: Passed, Time: 14/06/2020 11:20:25}
{TransactionID: 23752 , Status: Passed, Time: 14/06/2020 11:25:12}
{TransactionID: 23753 , Status: failed, Time: 14/06/2020 11:30:25}
{TransactionID: 23753 , Status: Passed, Time: 14/06/2020 11:35:12}

I'm a little confused here.

You ask how to only show one record, the latest, then you talk about wanting to show the total count? Do you want both of those things or just one of them?

@warkolm I just want to display the total number of transactions I received for a day. If a transaction fails, I can process the same transaction again. The transaction ID does not change in that case, but one more entry is added to the log file, so Kibana includes that record in the count as well. I hope my requirement is clear now. Thanks.

Put another way - you want the latest transaction for each ID, then a total count of all of those for the day?

@warkolm yes, exactly.
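
To make the agreed requirement concrete, here is a minimal Python sketch of the logic itself (keep the newest record per transaction ID, then count), run locally on the four example records from above. It is not a Kibana configuration. The function name `latest_per_id` is hypothetical, and the `%d/%m/%Y` date format is an assumption based on how the sample timestamps look.

```python
from datetime import datetime

# The four sample records posted earlier in this thread.
records = [
    {"TransactionID": 23751, "Status": "Passed", "Time": "14/06/2020 11:20:25"},
    {"TransactionID": 23752, "Status": "Passed", "Time": "14/06/2020 11:25:12"},
    {"TransactionID": 23753, "Status": "failed", "Time": "14/06/2020 11:30:25"},
    {"TransactionID": 23753, "Status": "Passed", "Time": "14/06/2020 11:35:12"},
]


def latest_per_id(rows, time_format="%d/%m/%Y %H:%M:%S"):
    """Return a dict mapping each transaction ID to its newest record."""
    latest = {}
    for row in rows:
        ts = datetime.strptime(row["Time"], time_format)
        key = row["TransactionID"]
        # Keep the row only if it is the first or a newer one for this ID.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, row)
    return {key: row for key, (ts, row) in latest.items()}


deduped = latest_per_id(records)
print(len(deduped))                 # 3 unique transactions
print(deduped[23753]["Status"])     # Passed
```

If only the count is needed in the dashboard, counting distinct transaction IDs gives the same number as counting the latest record per ID. In Elasticsearch terms that would probably be a cardinality aggregation on the ID field, which returns an approximate distinct count.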
