top_hits returns multiple values even though I set size = 1

Hi,

I am using an Elasticsearch query to find all the jobs that were running in the last 15 minutes.
I capture the job data in an index. I just want to get the latest doc (a single doc for each job). I tried to use top_hits, but I am getting multiple values for a job Id. Could you please help me?
Here is my query:

es.search(index='xyz-*',size=10000, body ={"aggs": { "group": { "terms": { "field": "Id.keyword" }, "aggs": { "group_docs": { "top_hits": { "size": 1, "sort": [ { "@timestamp": { "order": "desc" } } ] } } } } }, "query": { "bool": { "must": [ { "match_all": {} }, { "match_phrase": { "clustername": { "query": "abc" } } }, { "match_phrase": { "status.keyword": { "query": "RUNNING" } } },{ "range": { "@timestamp": { "gte": 1563428965639, "lte": 1563429865639, "format": "epoch_millis" } } }] }} })

That shouldn't happen with top_hits and size:1.
Can you share an example mapping, docs, search and result (with formatted JSON so it's readable)?
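
For reference, the top_hits results sit under the `aggregations` key of the response, one bucket per job, while `hits.hits` separately holds the plain search hits (up to the `size` passed to `es.search`). If the duplicates were seen in `hits.hits` rather than in the buckets, that could explain them, though that is only a guess without the actual result. Below is a minimal sketch of reading one doc per job from the aggregation part. It assumes the aggregation names `group` and `group_docs` from the query above; the helper name and the sample response are made up for illustration and trimmed to the relevant keys.

```python
def latest_doc_per_job(response):
    """Map each job Id to the _source of its newest document (from top_hits)."""
    latest = {}
    for bucket in response["aggregations"]["group"]["buckets"]:
        hits = bucket["group_docs"]["hits"]["hits"]
        if hits:  # size: 1 means at most one hit per bucket
            latest[bucket["key"]] = hits[0]["_source"]
    return latest


# Hypothetical, trimmed response: two plain hits for the same job,
# but only one top_hits document in that job's bucket.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"Id": "job-1", "status": "RUNNING", "@timestamp": 2}},
            {"_source": {"Id": "job-1", "status": "RUNNING", "@timestamp": 1}},
        ]
    },
    "aggregations": {
        "group": {
            "buckets": [
                {
                    "key": "job-1",
                    "doc_count": 2,
                    "group_docs": {
                        "hits": {
                            "hits": [
                                {"_source": {"Id": "job-1", "status": "RUNNING", "@timestamp": 2}}
                            ]
                        }
                    },
                }
            ]
        }
    },
}

per_job = latest_doc_per_job(sample_response)
```

Passing `size=0` to the search suppresses the plain hits if only the aggregation is needed.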

In practice it may not give you what you're looking for anyway. Presumably your query filters for log records where status is "running", so you won't see the logged event docs where the same job has status "stopped". You may therefore be assuming a job is still running when it's not.
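
One way to work around that with the existing index is to drop the status filter from the query, take the newest event per job, and check the status on the client side. The sketch below assumes the field names from the question (`Id.keyword`, `clustername`, `status`, `@timestamp`); the function names, the `max_jobs` parameter, and the sample buckets are hypothetical. Note that a terms aggregation returns only 10 buckets unless its `size` is raised.

```python
def build_latest_event_body(clustername, gte_ms, lte_ms, max_jobs=1000):
    """Search body: newest event per job in the window, with no status filter."""
    return {
        "size": 0,  # skip plain hits; only the aggregation is needed
        "query": {
            "bool": {
                "filter": [
                    {"match_phrase": {"clustername": {"query": clustername}}},
                    {
                        "range": {
                            "@timestamp": {
                                "gte": gte_ms,
                                "lte": lte_ms,
                                "format": "epoch_millis",
                            }
                        }
                    },
                ]
            }
        },
        "aggs": {
            "group": {
                # terms defaults to 10 buckets, so raise it to cover all jobs
                "terms": {"field": "Id.keyword", "size": max_jobs},
                "aggs": {
                    "group_docs": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{"@timestamp": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


def jobs_still_running(buckets):
    """Return job Ids whose newest event in the window has status RUNNING."""
    running = []
    for bucket in buckets:
        hits = bucket["group_docs"]["hits"]["hits"]
        if hits and hits[0]["_source"].get("status") == "RUNNING":
            running.append(bucket["key"])
    return running


# Hypothetical buckets: job-2 logged a later STOPPED event, so it is excluded.
sample_buckets = [
    {"key": "job-1", "group_docs": {"hits": {"hits": [{"_source": {"status": "RUNNING"}}]}}},
    {"key": "job-2", "group_docs": {"hits": {"hits": [{"_source": {"status": "STOPPED"}}]}}},
]
body = build_latest_event_body("abc", 1563428965639, 1563429865639)
still_running = jobs_still_running(sample_buckets)
```

The body would be passed as `es.search(index='xyz-*', body=body)`, and the buckets read from `response["aggregations"]["group"]["buckets"]`.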

In these sorts of cases it can be simpler to maintain a single doc per job, using the job ID as the document ID, and update its lastStatus and lastTime fields on each event. Then do your aggregations based on that.
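
A rough sketch of that single-doc approach with the Python client follows. The index name `job-status` and both function names are hypothetical, and `lastStatus.keyword` assumes default dynamic mapping of a string field. The `body=` argument follows the older client style used in the question; argument names differ between client versions, so check the one you have installed.

```python
def record_job_event(es, job_id, status, timestamp_ms, index="job-status"):
    """Write the job's current state, overwriting any earlier doc with the same ID."""
    doc = {"Id": job_id, "lastStatus": status, "lastTime": timestamp_ms}
    # Reusing the job ID as the document ID keeps exactly one doc per job.
    return es.index(index=index, id=job_id, body=doc)


def running_since_body(since_ms):
    """Search body for jobs whose last known status is RUNNING since a given time."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"lastStatus.keyword": "RUNNING"}},
                    {"range": {"lastTime": {"gte": since_ms}}},
                ]
            }
        }
    }
```

With one doc per job, no top_hits is needed: a plain search with `running_since_body(...)` returns each running job once. Events that arrive out of order could overwrite a newer status, so the writer may need to compare lastTime before indexing.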
