Watcher aggregation to nested field transformation

Hi all,
Have searched all over discuss on different categories. Some people raised similar question, but there is no clear answer. So, hope if we find a solution it will help many people.
Sometimes aggregation in query is used several times so we get nested buckets.
If we use action index we need to tranform data.
It is easy to make multiple docs with one bucket. Like:

"script": "return [ '_doc' : ctx.payload.aggregations.agg_name.buckets ]"
or
"script": "def docs = ; for(item in ctx.payload.aggregations.agg_name.buckets) { def document = [agg_name.key : agg_name.doc_count]; docs.add(document);} return ['_doc' : docs];"

But how to generate a correct mapping for nested field from sub bucket (i.e second bucket) to the doc for first bucket? Because we get an array.

My best was to get:

"script": "def docs = []; def subdocs = []; for(item in ctx.payload.aggregations.agg1.buckets) { for(subitem in item.agg2.buckets) { def subdocument = [subitem.key : subitem.doc_count]; subdocs.add(subdocument);} def document = [ 'a' : item.key, 'b' : item.doc_count, 'c' : subdocs ]; subdocs = []; docs.add(document);} return ['_doc' : docs];"

And I get 'c' field not nested but an array.

Can anyone give a clue?

1 Like

I'd be extremely helpful to also specify how the document now looks like and how you would like it to look like along with your script.

Your best way of debugging this is probably using a logging action with the above transform, as you can watch directly in the Execute watch API response, how your data structure looks like.

hope this helps!

--Alex

Alexander, thanks for interesting!
with a watcher like this:

{
  "trigger": {
    "schedule": {
      "interval": "30s"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "log-ws-2*"
        ],
        "types": [],
        "body": {
          "size": 0,
          "query": {
            "range": {
              "@timestamp": {
                "gte": "now-30s"
              }
            }
          },
          "aggs": {
            "services": {
              "terms": {
                "field": "beat.name.keyword"
              },
              "aggs": {
                "methods": {
                  "significant_terms": {
                    "field": "m_method.keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "ws-aggs-test": {
      "transform": {
        "script": {
          "source": "def docs = []; def subdocs = []; for(item in ctx.payload.aggregations.services.buckets) { for(method in item.methods.buckets) { def subdocument = [method.key : method.doc_count]; subdocs.add(subdocument);} def document = [ 'service' : item.key, 'count_overal' : item.doc_count, 'methods' : subdocs ]; subdocs = []; docs.add(document);} return ['_doc' : docs];",
          "lang": "painless"
        }
      },
      "index": {
        "index": "log-ws-stat-2018.11",
        "doc_type": "doc",
        "execution_time_field": "@timestamp"
      }
    }
  }
}

I get resulting documents like:

{
  "_index": "log-ws-stats-2018.11",
  "_type": "doc",
  "_id": "1rdSVGcBX-k3fJp0qD58",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-11-27T08:39:22.170Z",
    "service": "SERVICE1",
    "methods": [
      {
        "method1": 136
      },
      {
        "method2": 135
      },
      {
        "method3": 147
      },
      {
        "method4": 146
      },
      {
        "method5": 126
      },
      {
        "method6": 123
      },
      {
        "method7": 77
      },
      {
        "method8": 76
      }
    ],
    "count_overal": 966
  },
  "fields": {
    "@timestamp": [
      "2018-11-27T08:39:22.170Z"
    ]
  },
  "sort": [
    1543307962170
  ]
}

I get "methods" field as an array. I'm trying to make it nested, because array fields cannot be indexed.
Hope that clarifies.

arrays can be indexed in just fine in elasticsearch.

If you are talking about nested in the mapping, you need to configure this first, but I am not sure what you mean.

Can you also show what kind of end document you would like to have?

Sure,
I want to make it like "beat" field:

"methods": {
        "method1": 136,
        "method2": 135,
        "method3": 147,
        "method4": 146,
        "method5": 126,
        "method6": 123,
        "method7": 77,
        "method8": 76,
    },
    "count_overal": 966

But I'm not sure if it is necessary.
The goal is to get all methods for service in one document.
Will try to work it around.

See this line

def subdocument = [method.key : method.doc_count]; subdocs.add(subdocument)

you are creating an own map, for each subdocument and add to a list. subdocs needs to be a map ([:]), and then you can do subdocs[method.key] = method.doc_count

Worked like a charm!
Thank you for help!

For everyone to save time. The resulting transformation is:

"transform": {
        "script": {
          "source": "def docs = []; def subdocs = [:]; for(item in ctx.payload.aggregations.services.buckets) { for(method in item.methods.buckets) { subdocs[method.key] = method.doc_count } def document = [ 'service' : item.key, 'count_overal' : item.doc_count, 'methods' : subdocs ]; subdocs = [:]; docs.add(document);} return ['_doc' : docs];",
          "lang": "painless"
        }
      }
2 Likes

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.