Aggregation count on text arrays: count duplicates

Hello everyone,
I am counting occurrences of URLs. I have a path variable, which is an array of strings and is indexed as such (see the mapping below).
I am trying to get, for each URL, a count of how often it occurred in total. For example:

doc1.path = [ url_1, url_2, url_1 ]
doc2.path = [ url_1, url_2 ]

The count should be as follows:

url_1: 3
url_2: 2

Instead, it is:

url_1: 2
url_2: 2

Apparently, duplicates are removed. I need to also count duplicates. I have searched the forums, but only found a lot of information on nested arrays of objects. Would anyone have any ideas how to visualize this in a table?

Current configuration

Mapping:

{
  "user_paths" : {
    "mappings" : {
      "path" : {
        "properties" : {
          "begin" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          { ... }
        }
      }
    }
  }
}

If any further information is needed, please feel free to ask for it.
I appreciate any help. Thank you in advance.

The counts Elasticsearch reports are document counts: the number of docs that contain a term at least once.
You're after the number of occurrences of a term, which is a different thing.
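
To make the difference concrete, here is a small sketch in plain Python (not Elasticsearch code) that computes both numbers for the two example docs from the question. The function names are made up for illustration.

```python
from collections import Counter

# The two example documents from the question.
docs = [
    {"path": ["url_1", "url_2", "url_1"]},
    {"path": ["url_1", "url_2"]},
]


def doc_counts(docs):
    """Per URL, count the documents that contain it at least once.

    This mirrors what a terms aggregation reports as doc_count.
    """
    counts = Counter()
    for doc in docs:
        # set() drops repeats inside one document, so each doc adds at most 1.
        counts.update(set(doc["path"]))
    return dict(counts)


def occurrence_counts(docs):
    """Per URL, count every occurrence, including repeats inside one doc."""
    counts = Counter()
    for doc in docs:
        counts.update(doc["path"])
    return dict(counts)


print("doc counts:       ", doc_counts(docs))
print("occurrence counts:", occurrence_counts(docs))
```

The first function gives 2 and 2, the numbers you are seeing; the second gives 3 and 2, the numbers you want.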

You could use a scripted metric aggregation to compute something like that.
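
A rough, untested sketch of what such a request could look like, written as a Python dict so it can be dumped to JSON and sent to the `_search` endpoint of the index. Several things here are assumptions, not facts from your setup: the field is assumed to be called `path` as in your description, the aggregation name `url_occurrences` is made up, the scripts use the `state` / `states` variables (older versions use `params._agg` / `params._aggs` instead), and the map script reads the array from `params._source`, assuming it is available there. Reading `_source` is slow, but doc values for keyword fields are stored as a set per document, so the repeats would most likely already be gone there.

```python
import json

# Assumption: the array field is called "path", as described in the question.
FIELD = "path"

request_body = {
    "size": 0,  # only the aggregation result is needed, not the hits
    "aggs": {
        "url_occurrences": {  # hypothetical aggregation name
            "scripted_metric": {
                "params": {"field": FIELD},
                # Runs once per shard: start with an empty map of url -> count.
                "init_script": "state.counts = [:]",
                # Runs once per document: walk the original array from
                # _source so repeats inside one document are counted too.
                # Assumes the field holds an array, not a single string.
                "map_script": """
                    def values = params._source[params.field];
                    if (values != null) {
                      for (def v : values) {
                        state.counts[v] = state.counts.containsKey(v) ? state.counts[v] + 1 : 1;
                      }
                    }
                """,
                # Runs once per shard: hand the shard-local map to the reducer.
                "combine_script": "return state.counts",
                # Runs once on the coordinating node: merge the per-shard maps.
                "reduce_script": """
                    def total = [:];
                    for (def shard : states) {
                      if (shard != null) {
                        for (def e : shard.entrySet()) {
                          def k = e.getKey();
                          total[k] = total.containsKey(k) ? total[k] + e.getValue() : e.getValue();
                        }
                      }
                    }
                    return total;
                """,
            }
        }
    },
}

print(json.dumps(request_body, indent=2))
```

If it works as intended, the aggregation value should be a single map from URL to total occurrences. Note that the whole map is built in memory, so this only suits a moderate number of distinct URLs.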

