How to add a field that is only present in some documents?

If the problem is that some documents lack the field you group by, and grouping those documents together is acceptable, one workaround is to set up an ingest pipeline that fills the missing field with the value 'NULL'.

PUT _ingest/pipeline/set_NULL
{
  "description": "set 'NULL' for missing fields",
  "processors": [
    {"set":{
      "field":"organization",
      "value": "NULL",
      "if":"!ctx.containsKey('organization')"}}
  ]
}
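
To make the effect of the `if` condition concrete, here is a minimal Python sketch that models the processor locally. It is not Elasticsearch code; the function name `apply_set_null` and the sample documents are made up for illustration. It assumes `ctx.containsKey` behaves like an ordinary map lookup, so a document that carries the key with an explicit null value would not be filled.

```python
# Local model of the "set" processor above; this is not Elasticsearch code.
MISSING = "NULL"  # same placeholder string as in the pipeline


def apply_set_null(doc, field="organization"):
    """Mimic: set `field` to "NULL" only if the key is absent."""
    out = dict(doc)  # copy so the input document is left unchanged
    if field not in out:  # counterpart of !ctx.containsKey('organization')
        out[field] = MISSING
    return out


# Hypothetical sample documents
docs = [
    {"name": "a", "organization": "acme"},  # field present: left as-is
    {"name": "b"},                          # field missing: filled
    {"name": "c", "organization": None},    # key present with null: left as-is
]

processed = [apply_set_null(d) for d in docs]
for d in processed:
    print(d)
```

If documents with an explicit null should also be grouped under 'NULL', the condition would need an additional null check.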

PUT /your_index/_settings
{
  "index": {
    "default_pipeline": "set_NULL"
  }
}

# Apply the ingest pipeline to existing documents.
POST your_index/_update_by_query
{
  "query":{
    "match_all": {}
  }
}

I think the discussion would move forward if you shared not only sample data that works, but also sample data that does not work as expected, along with the output you want.

I wrote a sample scripted metric aggregation that collects the unique values of a field into an array, a kind of "unique values aggregation".
(This script is inspired by this post.)

"unique_organization":{
  "scripted_metric": {
   "init_script": "state.set = new HashSet()",
    "map_script": "if (params['_source'].containsKey(params.field)) {state.set.add(params['_source'][params.field])}",
    "combine_script": "return state.set",
    "reduce_script": "def ret = new HashSet(); for (s in states) {for (k in s) {ret.add(k);}} return ret",
    "params":{
      "field": "organization"
    }
  }
}
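
As a rough illustration of how the four scripts fit together, the following Python sketch models the phases locally: one state per shard, `map_script` run once per document, `combine_script` once per shard, and `reduce_script` once over all shard results. It is a simplified model and not Elasticsearch code; the `unique_values` driver and the two-shard sample data are assumptions made for the example.

```python
# Simplified local model of the scripted metric phases; not Elasticsearch code.

def init_script():
    # state.set = new HashSet()
    return {"set": set()}


def map_script(state, source, field):
    # Runs once per document: collect the value if the field exists.
    if field in source:
        state["set"].add(source[field])


def combine_script(state):
    # Runs once per shard: hand the shard-local set to the reduce phase.
    return state["set"]


def reduce_script(states):
    # Runs once on the coordinating node: merge all shard-local sets.
    ret = set()
    for s in states:
        ret.update(s)
    return ret


def unique_values(shards, field):
    # Hypothetical driver that plays the role of the search request.
    states = []
    for shard_docs in shards:
        state = init_script()
        for source in shard_docs:
            map_script(state, source, field)
        states.append(combine_script(state))
    return reduce_script(states)


# Hypothetical data spread over two shards
shards = [
    [{"organization": "acme"}, {"organization": "globex"}, {"name": "no org"}],
    [{"organization": "acme"}, {"organization": "initech"}],
]

result = unique_values(shards, "organization")
print(sorted(result))
```

Documents without the field are skipped in the map phase, and values seen on several shards are deduplicated in the reduce phase.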