I want to identify whether there are duplicates using these 2 fields of doc_a. So I tried
copy_to:
I created a mapping in which both proptexta and proptextb have copy_to set to proptextab, and I used the new field proptextab.
I assumed proptextab would be a concatenation of proptexta and proptextb. Then I ran an aggregation on proptextab, and the resulting buckets are separate:
copy_to makes a multi-valued field rather than one whose value is the concatenation of the two values. It's
{
"key": ["123", "abc"]
}
rather than
{
"key": "123abc"
}
A transform could be used to produce what you are looking for, but I advise against it because it's deprecated and dangerously difficult to debug. Your best bet is to do the concatenation in your application on the way into Elasticsearch. That way you can get the _source back to verify that it worked properly.
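As a rough illustration of doing the concatenation in the application, here is a minimal Python sketch. The helper name with_combined_key, the sep parameter, and the sample documents are hypothetical and only for illustration; the field names proptexta, proptextb and proptextab come from the question. The actual indexing call is left out, since it depends on your client and version.

```python
from collections import Counter


def with_combined_key(doc, sep=""):
    """Return a copy of doc with proptextab set to proptexta + sep + proptextb."""
    combined = dict(doc)  # shallow copy so the input document is not modified
    combined["proptextab"] = f"{doc['proptexta']}{sep}{doc['proptextb']}"
    return combined


# Hypothetical sample documents standing in for doc_a.
docs = [
    {"proptexta": "123", "proptextb": "abc"},
    {"proptexta": "123", "proptextb": "abc"},
    {"proptexta": "123", "proptextb": "xyz"},
]

# Add the single-valued combined field before the documents are indexed.
prepared = [with_combined_key(d) for d in docs]

# Local check that mimics what one bucket per combined value would report.
counts = Counter(d["proptextab"] for d in prepared)
duplicates = {key: n for key, n in counts.items() if n > 1}
print(duplicates)  # {'123abc': 2}
```

With an empty separator, "12" + "3abc" and "123" + "abc" would produce the same key, so passing a separator that cannot appear in either field (for example sep="|") is probably safer if the values are free-form.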
I also wonder whether I can use a script for the fields in the aggregation.
I am able to do this with my two sets of docs (doc_a and doc_b) in one index, and it works fine with one field, proptexta: