Hello. I found that in some cases a composite aggregation returns stale documents (the initially indexed version instead of the updated one). Steps to reproduce:
curl -X PUT -H 'Content-Type: application/json' localhost:9200/test -d '{
"settings": {
"index": {
"sort.field": "id"
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text"
}
}
}
}'
curl -X PUT -H 'Content-Type: application/json' localhost:9200/test/_doc/1 -d '{
"id": 1,
"value": "Old Value"
}'
curl -X POST -H 'Content-Type: application/json' localhost:9200/test/_update/1 -d '{
"doc": {
"value": "New Value"
}
}'
curl -X GET -H 'Content-Type: application/json' 'localhost:9200/test/_search?pretty' -d '{
"aggs": {
"composite": {
"composite": {
"size": 1,
"sources": [
{ "id": { "terms": { "field": "id" } } }
],
"after": {"id": "0"}
},
"aggs": {
"top_hits": {
"top_hits": {
"size": 1
}
}
}
}
}
}'
Response:
{
"took" : 102,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : 1,
"value" : "New Value"
}
}
]
},
"aggregations" : {
"composite" : {
"after_key" : {
"id" : "1"
},
"buckets" : [
{
"key" : {
"id" : "1"
},
"doc_count" : 2,
"top_hits" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : 1,
"value" : "Old Value"
}
}
]
}
}
}
]
}
}
}
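As you can see, the top-level hits contain the current value ("New Value"), but the top_hits inside the composite aggregation bucket returns the old one ("Old Value"). The bucket also reports a doc_count of 2, although the index contains only one document.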
The problem disappears when I remove either the index sort setting from the index settings (which I think is important for performance when paginating) or the "after" property from the query.
Am I doing something wrong, or is this a bug in Elasticsearch?
Elasticsearch version: 7.9.2