There is an Elasticsearch 7.3.2 instance with an index containing a "message" field of type text.
I need to get all docs whose message is longer than 100000 characters:
curl -XPOST -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc[\"message\"].toString().length() > 10000",
            "lang": "painless"
          }
        }
      }
    }
  },
  "script_fields": {
    "A": { "script": { "lang": "painless", "source": "params._source.message.toString().length()" } },
    "B": { "script": { "lang": "painless", "source": "doc[\"message\"].toString().length()" } },
    "C": { "script": { "lang": "painless", "source": "doc[\"message\"].length" } },
    "D": { "script": { "lang": "painless", "source": "doc[\"message\"].size()" } }
  }
}
' "127.0.0.1:9200/myindex/_search?pretty=true"
Result:
  "fields" : {
    "A" : [ 2155780 ],
    "B" : [ 13206 ],
    "D" : [ 1514 ],
    "C" : [ 1514 ]
  }
Why are they all different?
Which one should be used in the query filter?
I need the value from "A", but I cannot use it in the query filter because it relies on _source, which is not available in a filter script.