I am running Elasticsearch 6.3 and trying to perform a terms aggregation with a script as its source. In the script, I use doc values to access a list field that contains duplicates. It seems that all duplicates are removed when I access the field through doc['my_field1']. Is that the intended behaviour?
Snippet from my document:
{
  "my_field1": ["a", "a", "b", "b"],
  "my_field2": ["c", "d", "e", "f"]
}
Snippet of my code:
"aggregations": {
  "terms_agg1": {
    "terms": {
      "script": {
        "source": "\n String returnString = '';\n for (int i = 0; i < doc['my_field1'].length; i++) { returnString += doc['my_field1'][i] + ';' + doc['my_field2'][i] + ')'; } return returnString; ",
        "lang": "painless"
      }
    }
  }
}
I expected the script to iterate over the entire contents of my_field1 and my_field2, but the former is read as ["a", "b"], so my loop only runs twice.
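To make the observed behaviour concrete, here is a small Python model of what the script appears to see. It is only a sketch and not Elasticsearch code: the helper names doc_values_view and run_script are made up for illustration, and the assumption that doc values expose each distinct value once, in sorted order, is my guess at what is happening rather than something I have confirmed.

```python
def doc_values_view(values):
    # Assumption (not confirmed): doc values behave like a sorted set,
    # so each distinct value appears once, in sorted order.
    return sorted(set(values))


def run_script(doc):
    # Mirrors the Painless loop: iterate over my_field1 and pair by index.
    field1 = doc_values_view(doc["my_field1"])
    field2 = doc_values_view(doc["my_field2"])
    result = ""
    for i in range(len(field1)):
        result += field1[i] + ";" + field2[i] + ")"
    return result


document = {
    "my_field1": ["a", "a", "b", "b"],
    "my_field2": ["c", "d", "e", "f"],
}

# The loop runs twice instead of four times, matching what I observe.
print(run_script(document))
```

Under that assumption the script only ever pairs "a" with "c" and "b" with "d", and the entries "e" and "f" of my_field2 are never visited.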
As far as I can see, the doc_values documentation does not say anything about removing duplicates. I understand that the aggregation will eventually remove duplicates, but that should not apply to the source of the script, should it?
Any help would be greatly appreciated!