I am trying to query my data set with a composite aggregation.
Here is Query 1:
curl -X POST "localhost:9200/index1-202103/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {"bool": {"filter": [{ "range": { "date": {"gte": "20210330", "lte":"20210330"} }},{ "term": { "userid": "16114" }},{"exists": {"field": "opens"}},{"exists": {"field": "tags"}}]}},
"aggs" : {
"my_buckets" : {
"composite": {
"sources": [
{ "from_domain_wise": { "terms": {"field": "domain" } } },
{ "msp_wise": { "terms": {"field": "msp" } } },
{ "fromaddress_wise": { "terms": {"field": "fromaddress" } } },
{ "tag_wise": { "terms": {"field": "tags" } } },
{ "rate_over_time" : { "date_histogram" : { "field" : "opens.time", "interval" : "1h" } } }
]
}
}
}
}'
Here is Query 2:
curl -X POST "localhost:9200/index1-202103/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
"query": {"bool": {"filter": [{ "range": { "date": {"gte": "20210330", "lte":"20210330"} }},{ "term": { "userid": "16114" }},{"exists": {"field": "opens"}},{"exists": {"field": "tags"}}]}},
"aggs" : {
"my_buckets" : {
"composite": {
"sources": [
{ "from_domain_wise": { "terms": {"field": "domain" } } },
{ "msp_wise": { "terms": {"field": "msp" } } },
{ "fromaddress_wise": { "terms": {"field": "fromaddress" } } },
{ "tag_wise": { "terms": {"field": "tags" } } }
]
},
"aggs": { "rate_over_time" : { "date_histogram" : { "field" : "opens.time", "interval" : "1h" } } }
}
}
}'
Both queries return date histogram results, but with different counts. From what I can tell, Query 1 counts every value of opens.time (format: 2021-03-30 15:15:45), including duplicate values that fall in the same hour, whereas Query 2 counts a document only once per hour even if several of its opens.time values fall in that hour. The opens field stores an array of objects, so opens.time can hold several values per document. Can anyone please explain why the two queries behave differently, even though both have the same goal? I want the results that Query 2 gives, not those of Query 1.
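To make the difference concrete, here is a minimal shell sketch of the two counting behaviors I am observing. The sample timestamps and the function names (hour_keys, count_every_value, count_once_per_doc) are made up for illustration; the sketch only mimics the counts I see for a single document and says nothing about how Elasticsearch works internally.

```shell
#!/bin/sh
# Hypothetical opens.time values stored in ONE document (illustrative only).
times='2021-03-30 15:15:45
2021-03-30 15:40:10
2021-03-30 16:05:00'

# Truncate each timestamp to "YYYY-MM-DD HH", i.e. its 1h bucket key.
hour_keys() { printf '%s\n' "$1" | cut -c1-13; }

# Behavior seen with Query 1: every value adds to its hour bucket.
count_every_value() {
  hour_keys "$1" | sort | uniq -c | awk '{print $2" "$3":00 -> "$1}'
}

# Behavior seen with Query 2: the document counts once per hour bucket.
count_once_per_doc() {
  hour_keys "$1" | sort -u | awk '{print $1" "$2":00 -> 1"}'
}

echo "Query 1 style:"; count_every_value "$times"
echo "Query 2 style:"; count_once_per_doc "$times"
```

With these sample values, the 15:00 bucket gets a count of 2 in the first style and 1 in the second, which is the kind of mismatch I see between the two queries.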