Terms query: return results distinct by field

Hi,
Given a terms query like this:

{
    "query": {
        "terms" : { "user" : ["u1", "u2","u3"]}
    }
}

this returns every document whose user matches u1, u2, or u3, so the same user can appear more than once,
such as:

result-1: u1
result-2: u2
result-3: u1
result-4: u3
result-5: u3
...

I want the number of hits to equal the number of terms in the query, one hit per user,
such as:

result-1: u1
result-2: u2
result-3: u3

So we want to run a query like this:

{
    "query": {
        "terms" : { "user" : ["u1", "u2","u3", ...... "u10000"]}
    }
}

and we want the search results to look like this:

result-1: u1
result-2: u2
......
result-10000: u10000

What you're after is an aggregation rather than individual documents. As your example shows, a single user can produce many events, each of which appears as its own document:

GET test/_search
{
  "query": {
    "terms": {
      "user": [1, 5, 3]
    }
  },
  "size": 0,
  "aggs": {
    "topMatchingUsers": {
      "terms": {
        "field": "user"
      }
    }
  }
}

@Mark_Harwood
Thanks for your reply, but the aggregation only returns buckets, like this:

hits: {
  duplicate hits result
}
aggs: {
  "buckets" : [
    {
      "key" : "u1",
      "doc_count" : 29
    },
    {
      "key" : "u2",
      "doc_count" : 9
    },
    ....
  ]
}

How can I make the buckets return the matching documents themselves, not just key/count pairs?

Not sure I understand. Each bucket's doc_count is the total number of documents referencing that user?
If your question is about how to avoid the JSON formatting of results, I'm afraid that's unavoidable.

@Mark_Harwood
To elaborate: I want the result to contain documents, not document counts.
For example, a terms query on user: [u1, u2] returns:

{
"user": "u1",
"uid": "1000"
},
{
"user": "u1",
"uid": "1001"
},
{
"user": "u1",
"uid": "1002"
},
{
"user": "u2",
"uid": "2002"
},
{
"user": "u2",
"uid": "2003"
},

However, I just want one document per distinct user:

{
"user": "u1",
"uid": "1000"
},
{
"user": "u2",
"uid": "2003"
}

is okay, and

{
"user": "u1",
"uid": "1001"
},
{
"user": "u2",
"uid": "2002"
}

is also okay for us.
We just want distinct documents from the terms query, not aggregated document counts.

Similar to this topic, we just want the terms query to return a unique list of documents.

Maybe field collapsing is what you're looking for?
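
A minimal sketch of what field collapsing could look like for this case. It assumes `user` is a single-valued keyword or numeric field with doc values (collapse requires this) and reuses the `test` index from the earlier example; the body would be sent to `GET test/_search`:

```json
{
  "query": {
    "terms": { "user": ["u1", "u2", "u3"] }
  },
  "collapse": {
    "field": "user"
  }
}
```

Each hit is then the top-ranked document for one user, so any one of a user's documents may be the one returned. Note that `hits.total` still counts the matching documents before collapsing, and the default `size` of 10 would likely need to be raised for a long list of terms.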

Yes, I found that collapsing on the field returns the distinct results. Thanks!
