Elasticsearch Script Query - Returning multiple values on aggregation

Aggregation Result

"aggregations": {
      "649": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "ALLOGENE THERAPEUTICS",
               "doc_count": 43
            },
            {
               "key": "CELLECTIS",
               "doc_count": 5
            },
            {
               "key": "PFIZER",
               "doc_count": 4
            }
         ]
      }
   }

Desired Merged Aggregation Result

"aggregations": {
      "649": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "STUB",   // ALLOGENE THERAPEUTICS + CELLECTIS
               "doc_count": 48
            },
            {
               "key": "PFIZER",
               "doc_count": 1
            }
         ]
      }
   }

I want to merge the "ALLOGENE THERAPEUTICS" and "CELLECTIS" buckets into a single bucket

and change its key to "STUB".

To do that, I wrote the following query using a script.

The scripting language is Groovy.

my_field, the field being aggregated, is multi-valued (each document holds a list of values).

{
  "size" : 0,
  "query" : {
    "ids" : {
      "types" : [ ],
      "values" : [ "id1", "id2", "id3", "id4" ... ]
    }
  },
  "aggregations" : {
    "649" : {
      "terms" : {
        "script" : {
          "inline" : 
              "def param = new groovy.json.JsonSlurper().parseText(
                  '{\"ALLOGENE THERAPEUTICS\": \"STUB\", \"CELLECTIS\": \"STUB\"}'
              ); 
              def data = doc['my_field'].values; 
              def list = [];
              if (!doc['my_field'].empty) {   // my_field is list type
                  for (x in data) { 
                      if (param[x] != null) { 
                          list.add(param[x]); 
                      } 
                  } 
              }; 
              if (list.isEmpty()) { 
                  return data;  // PFIZER
              } else { 
                  return list;  // list["STUB", "STUB"]
              }"
        },
        "size" : 50
      }
    }
  }
}

Based on the original counts (43 + 5), the STUB bucket should have a doc_count of 48, but the query returns 47.

"aggregations": {
      "649": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "STUB",   // ALLOGENE THERAPEUTICS + CELLECTIS
               "doc_count": 47  // It has to be 48 !!
            },
            {
               "key": "PFIZER",
               "doc_count": 1
            }
         ]
      }
   }

I've tried many things, and I suspect the problem is related to my_field being a list.

It looks as if the script is not picking up all of the elements.
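
One thing I have not been able to rule out: as far as I understand, doc_count counts documents rather than values, so a document whose my_field contains both company names would add only 1 to STUB even though the script returns two values for it. The Python sketch below simulates that idea outside Elasticsearch. The documents are made up, and the helper names (script_values, terms_doc_counts) are hypothetical, so it only illustrates the possible effect and says nothing certain about my index.

```python
from collections import Counter

# Same mapping as the 'param' map in the Groovy script.
MERGE = {"ALLOGENE THERAPEUTICS": "STUB", "CELLECTIS": "STUB"}


def script_values(my_field):
    """Mimic the Groovy script: return the mapped names if any exist,
    otherwise return the raw field values."""
    mapped = [MERGE[x] for x in my_field if x in MERGE]
    return mapped if mapped else list(my_field)


def terms_doc_counts(docs):
    """Count each document at most once per key (assumed doc_count behaviour)."""
    counts = Counter()
    for my_field in docs:
        for key in set(script_values(my_field)):
            counts[key] += 1
    return counts


# Made-up documents: the second one lists both companies.
docs = [
    ["ALLOGENE THERAPEUTICS"],
    ["ALLOGENE THERAPEUTICS", "CELLECTIS"],
    ["CELLECTIS"],
    ["PFIZER"],
]

counts = terms_doc_counts(docs)
print(counts["STUB"])    # 3, although the separate buckets would be 2 + 2 = 4
print(counts["PFIZER"])  # 1
```

If this is what happens, a gap of 1 between 48 and 47 would mean exactly one of my documents lists both "ALLOGENE THERAPEUTICS" and "CELLECTIS".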

I'd appreciate any suggestions.
