Transform - List the unique values

Hi,

I'm using a Transform to group logs related to the same event.

An event can have multiple IPs.

A simple aggregation like this:

"test_agg": {
  "terms": {
    "field": "vm_ip"
  }
},

This lists every IP and counts its occurrences. In the transform output, each IP becomes a field name with its count as the value, something like this:

test_agg: {
  XX.XX.XX.XX: 4,
  YY.YY.YY.YY: 7,
}

How can I list just the distinct values in an array instead, like this:

my_agg: [XX.XX.XX.XX, YY.YY.YY.YY]

Thanks a lot for your time!

There is no out-of-the-box functionality that can do that as far as I know, but you can use scripting:

Option A: Write a scripted_metric aggregation to collect the values and write the results as a flat list.

Option B: Use terms and re-map the data in an ingest pipeline with a script processor to drop the counts.
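
For Option B, a minimal sketch of such a pipeline might look like the following. The pipeline name flatten_ips and the output field my_agg are hypothetical, and the script assumes the transform writes the terms result as an object under test_agg whose keys are the IPs, as shown in the question; check an actual destination document before relying on it.

```console
PUT _ingest/pipeline/flatten_ips
{
  "description": "Sketch: keep the terms keys as a list and drop the counts",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          // Assumption: ctx.test_agg is a map of IP -> count
          if (ctx.test_agg != null) {
            // keep only the keys (the IPs) as a flat list
            ctx.my_agg = new ArrayList(ctx.test_agg.keySet());
            // remove the original object with the counts
            ctx.remove('test_agg');
          }
        """
      }
    }
  ]
}
```

The transform would then need to reference the pipeline in its destination, for example "dest": { "index": "...", "pipeline": "flatten_ips" }.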


Painless is like Java, so we can remove the duplicates like this:

"ips": {
  "scripted_metric": {
    "init_script": "state.docs = []",
    "map_script": """
      state.docs.add(doc['hostname'].value)
    """,
    "combine_script": "return state.docs;",
    "reduce_script": """
      def all_docs = [];
      for (s in states) {
        for (span in s) {
          all_docs.add(span);
        }
      }
      return all_docs.stream().distinct().collect(Collectors.toList());
    """
  }
}

I don't know if it's efficient, but it does return the expected result.

If you use a set you can deduplicate on the fly:

"ips": {
  "scripted_metric": {
    "init_script": "state.docs = new HashSet()",
    "map_script": """
      state.docs.add(doc['hostname'].value)
    """,
    "combine_script": "return state.docs;",
    "reduce_script": """
      def all_docs = new HashSet();
      for (s in states) {
        all_docs.addAll(s);
      }
      return all_docs;
    """
  }
}
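
One thing to watch for: doc['hostname'].value only returns the first value of a multi-valued field and throws an error when a document has no value for the field. A hedged variant of the same aggregation, assuming the same hostname field, iterates over all values instead, so documents without a value simply contribute nothing:

```console
"ips": {
  "scripted_metric": {
    "init_script": "state.docs = new HashSet()",
    "map_script": """
      // loop over every value of the field; an empty field adds nothing
      for (v in doc['hostname']) {
        state.docs.add(v);
      }
    """,
    "combine_script": "return state.docs;",
    "reduce_script": """
      // merge the per-shard sets into one set of distinct values
      def all_docs = new HashSet();
      for (s in states) {
        all_docs.addAll(s);
      }
      return all_docs;
    """
  }
}
```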

Thanks a lot!
