Transform vs copy_to, text concatenation used for aggregation

I have a similar question to this poster:

I have 2 fields:

  1. doc_a.prop.proptexta
  2. doc_a.prop.proptextb

I want to identify whether there are duplicates based on these 2 fields of doc_a. So I tried:

copy_to:
I created a mapping in which both proptexta and proptextb have copy_to set to proptextab, and then used the new field proptextab.
I assumed proptextab would be the concatenation of proptexta and proptextb. I then ran an aggregation on proptextab, but the resulting buckets are separate:

Say proptexta = 123 and proptextb = abc.

In my aggregation result on proptextab, I have

buckets: 
key: "123" 
doc_count:...
key: "abc"
doc_count:....

whereas I was expecting

buckets:
key: "123abc"
doc_count:...

So now I'm researching transform scripts in mappings, but I am not sure that will work either.

Continuing the discussion from Is it possible to use transform scripts in mappings to alter document _id?:


copy_to makes a multi-valued field rather than one whose value is the concatenation of the two values. It's

{
  "key": ["123", "abc"]
}

rather than

{
  "key": "123abc"
}
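
To make the difference concrete, here is a rough simulation in plain Python (not Elasticsearch itself) of how a terms-style bucket count behaves on a multi-valued field versus a single concatenated value. The helper `terms_buckets` and the second sample document with "xyz" are made up for illustration, and the sketch assumes the values are indexed as exact, non-analyzed strings.

```python
from collections import Counter

def terms_buckets(docs, field):
    """Count documents per distinct value; list values are treated as multi-valued."""
    counts = Counter()
    for doc in docs:
        value = doc[field]
        values = value if isinstance(value, list) else [value]
        # A document contributes at most once to each distinct key.
        for key in set(values):
            counts[key] += 1
    return dict(counts)

# What copy_to effectively gives you: one field holding both values.
copy_to_docs = [{"proptextab": ["123", "abc"]}, {"proptextab": ["123", "xyz"]}]
# What the question was hoping for: one concatenated value per document.
concat_docs = [{"proptextab": "123abc"}, {"proptextab": "123xyz"}]

multi_valued_buckets = terms_buckets(copy_to_docs, "proptextab")
concatenated_buckets = terms_buckets(concat_docs, "proptextab")
print(multi_valued_buckets)
print(concatenated_buckets)
```

With the multi-valued field, "123" lands in a bucket of its own shared by both documents, so the pair (proptexta, proptextb) is no longer visible as a unit.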

Transform could be used to produce what you are looking for, but I advise against it because it's deprecated and dangerously difficult to debug. Your best bet is to do the concatenation in your application on the way into Elasticsearch. That way you can get the _source back to verify that it worked properly.
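
A minimal sketch of the application-side approach, assuming the document is a dict with a nested "prop" object as in the question. The helper name `add_concatenated_field` and its `separator` parameter are hypothetical, and the actual indexing call is left out because it depends on the client you use.

```python
import json

def add_concatenated_field(doc, separator=""):
    """Return a copy of doc with prop.proptextab = proptexta + separator + proptextb."""
    prop = dict(doc["prop"])  # copy so the caller's document is not modified
    prop["proptextab"] = f'{prop["proptexta"]}{separator}{prop["proptextb"]}'
    return {**doc, "prop": prop}

# Sample document shaped like the one in the question (assumed structure).
doc_a = {"prop": {"proptexta": "123", "proptextb": "abc"}}
enriched = add_concatenated_field(doc_a)

# JSON body you would send to Elasticsearch with whatever client you use.
body = json.dumps(enriched)
print(body)
```

One thing to consider: with an empty separator, ("12", "3abc") and ("123", "abc") produce the same key, so passing a separator that cannot occur in either field (for example "|") may be safer for duplicate detection.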

Thank you for the quick reply.

I also wonder whether I can use a script for the fields in an aggregation.
I am able to do this with my two sets of docs (doc_a and doc_b) in one index, and it works fine with one field, proptexta:

"aggs": {
    "MyAggResultComparingTwoDocNames": {
        "terms": {
            "script": "doc['doc_a.prop.proptexta'].values + doc['doc_b.prop.proptexta'].values"
        }
    }
}

Is there a Groovy script solution for the following?

(concatenate(doc_a.prop.proptexta,doc_a.prop.proptextb)).values + (concatenate(doc_b.prop.proptexta,doc_b.prop.proptextb)).values