You can use a transform for this. The group_by must be based on a common criterion, e.g. a shared id.
For the combining step, you can use a scripted_metric aggregation, e.g.:
"all_docs": {
"scripted_metric": {
"init_script": "state.docs = []",
"map_script": "state.docs.add(new HashMap(params['_source']))",
"combine_script": "return state.docs",
"reduce_script": "def docs = []; for (s in states) {for (d in s) { docs.add(d);}}return docs"
}
}
This creates an array containing the original docs. If you want to merge them, or merge only a specific field, you need to adjust the scripts accordingly; the docs contain some more ideas.
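For context, here is a sketch of how that aggregation might sit inside a full transform request. The transform id, the index names, and the `common_id` field are placeholders for illustration; replace them with your own:

```json
PUT _transform/merge_docs_example
{
  "source": { "index": "my-source-index" },
  "dest":   { "index": "my-merged-index" },
  "pivot": {
    "group_by": {
      "common_id": { "terms": { "field": "common_id" } }
    },
    "aggregations": {
      "all_docs": {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": "state.docs.add(new HashMap(params['_source']))",
          "combine_script": "return state.docs",
          "reduce_script": "def docs = []; for (s in states) { for (d in s) { docs.add(d); } } return docs"
        }
      }
    }
  }
}
```

Each destination document then holds the `all_docs` array with all source docs that share the same `common_id`.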
I am not aware of another solution, and I think the Painless solution should perform well. Painless is compiled into Java byte code and is therefore comparable in speed to a pure Java implementation.