I have a problem when I want to calculate data based on the result of sorting.
For example, I have the data below:
{Id : 1, value : 2}
{Id : 2, value : 1}
and the returned result is:
[{id : 2, test1 : 3},
{id : 1, test1 : 2}]
As far as I know, the script runs before the sort. What I want is for the results to be sorted by Id first, and only then for the script that calculates "test1" to run. How can I do that?
Below is the expected result:
[{id : 2, test1 : 1},
{id : 1, test1 : 3}]
Will it make any difference whether sorting is done earlier or afterwards? The id field is not modified at all by your script.
I don't think we can move in that direction, and there is a simple reason for that: Elasticsearch shards are distributed by nature. Say you have 5 shards. These 5 shards can be spread across multiple data nodes. Executing the script on the data nodes keeps the work distributed, because each data node only executes it over the shards it holds. When you perform a sort operation, the documents are sorted on each data node first, and then again on the coordinating node once the per-shard results have been gathered.
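To make the two-stage sort described above concrete, here is a rough simulation in plain Python. It does not call Elasticsearch; the shard contents, the function names `shard_phase` and `coordinating_phase`, and the per-document stand-in script (`value * 10`) are all assumptions for illustration.

```python
import heapq

# Hypothetical shard contents; in a real cluster, routing decides which
# shard holds which document.
shard_a = [{"id": 1, "value": 2}, {"id": 4, "value": 5}]
shard_b = [{"id": 2, "value": 1}, {"id": 3, "value": 7}]

def shard_phase(docs):
    """Simulates a data node: run the script per document, then sort locally."""
    # Stand-in script: it only sees one document at a time.
    scripted = [{"id": d["id"], "test1": d["value"] * 10} for d in docs]
    return sorted(scripted, key=lambda d: d["id"], reverse=True)

def coordinating_phase(shard_results):
    """Simulates the coordinating node: merge the already-sorted shard lists."""
    return list(heapq.merge(*shard_results, key=lambda d: d["id"], reverse=True))

merged = coordinating_phase([shard_phase(shard_a), shard_phase(shard_b)])
print(merged)
```

In this sketch the script has already run by the time the global order exists, so it cannot depend on a document's final position in the merged list.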
So if you want to run these scripts after sorting, you would have to run them on the coordinating node, which may not be feasible, as it would turn a potentially parallel, distributed task into a serial one.
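One way around this is to do the order-dependent calculation in the client, after the sorted hits come back. The sketch below assumes that "test1" is a running total of "value" (the original script was not posted, but a running total reproduces both the returned and the expected result above). The function name `running_total_after_sort` is hypothetical, and the input is a plain list of dicts rather than a real search response.

```python
from itertools import accumulate

def running_total_after_sort(hits, sort_key="id", value_key="value", descending=True):
    """Sort the hits first, then compute a running total in the sorted order."""
    ordered = sorted(hits, key=lambda h: h[sort_key], reverse=descending)
    totals = accumulate(h[value_key] for h in ordered)
    return [{"id": h[sort_key], "test1": t} for h, t in zip(ordered, totals)]

# Sample data from the question, with the field name lower-cased.
hits = [{"id": 1, "value": 2}, {"id": 2, "value": 1}]
result = running_total_after_sort(hits)
print(result)  # [{'id': 2, 'test1': 1}, {'id': 1, 'test1': 3}]
```

This keeps the search itself fully distributed and only serializes the small, already-sorted page of results that the client receives.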