Elasticsearch: DSL Query

Hi guys, I am opening this topic because I have a problem with a large amount of data (14M).
My dataset is composed as follows:

{"h":{"id":"AA001","process":"AK01","update-timestamp":1663665372171}}
{"h":{"id":"AA002","process":"AK01","update-timestamp":1663665372171}}
{"h":{"id":"AA003","process":"AK01","update-timestamp":1663665372171}}
{"h":{"id":"AA004","process":"AK01","update-timestamp":1663665372171}}
{"h":{"id":"AA001","process":"AK01","update-timestamp":1663665372172}}
{"h":{"id":"AA001","process":"AK01","update-timestamp":1663665372173}}

If my pipeline worked correctly, each key (id and process) should have only one document, the one with the most recent update-timestamp.
So I'm trying to count the duplicates: for each id/process pair, I want to know how many update-timestamps are associated with it.
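
To make the goal concrete, here is a rough sketch of the kind of request I have in mind, written as Python that only builds the request body and filters a response. It assumes `h.id` and `h.process` are mapped as keyword fields (with a default dynamic mapping they would be `h.id.keyword` and `h.process.keyword`). The names `by_key`, `build_duplicates_query` and `duplicated_keys` are made up for illustration. It uses a composite aggregation so the 14M records can be paged through instead of being aggregated in one shot, and it counts documents per key, which may not be the only way to do this.

```python
from typing import Optional

# Assumption: both fields are keyword-mapped. Adjust to "h.id.keyword" /
# "h.process.keyword" if the index uses the default text + keyword mapping.
ID_FIELD = "h.id"
PROCESS_FIELD = "h.process"


def build_duplicates_query(page_size: int = 1000,
                           after_key: Optional[dict] = None) -> dict:
    """Build a search body that groups documents by (id, process).

    Each bucket's doc_count is the number of documents sharing that key.
    """
    composite = {
        "size": page_size,
        "sources": [
            {"id": {"terms": {"field": ID_FIELD}}},
            {"process": {"terms": {"field": PROCESS_FIELD}}},
        ],
    }
    # To fetch the next page, pass the "after_key" returned by the previous page.
    if after_key is not None:
        composite["after"] = after_key
    # "size": 0 suppresses the hits; only the aggregation buckets are needed.
    return {"size": 0, "aggs": {"by_key": {"composite": composite}}}


def duplicated_keys(response: dict) -> list:
    """Keep only the keys that appear in more than one document."""
    buckets = response["aggregations"]["by_key"]["buckets"]
    return [
        (b["key"]["id"], b["key"]["process"], b["doc_count"])
        for b in buckets
        if b["doc_count"] > 1
    ]
```

With the sample data above, the bucket for AA001 / AK01 would have a doc_count of 3 and the other ids a doc_count of 1, so only AA001 should be reported. The body would be sent to the index's `_search` endpoint and repeated with the returned `after_key` until no buckets come back.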

I tried to do this in Kibana with a data table, but the data volume is too high and it fails with an error.

Could you help me write this with the query DSL?

Thanks in advance!
Salvo
