Showing all keywords matching a certain pattern

Hi folks. I am not sure how to achieve the visualisation that I need from our web logs.

We have a bunch of URLs being logged, part of which contains a customer-identifying string.

/customerA/whatever.html
/customerA/something.html
/customerA/whatever.html
/customerB/something.html
/customerC/whatever.html

I am using a standard analyzer on the field, so I can get charts based on the tokens in there:

whatever.html    3
customerA        3
something.html   2
customerB        1
customerC        1

What I would like to get out is a count of just the customer part, restricted to documents where one of the other tokens is present (e.g. whatever.html):

customerA 2
customerC 1

Or e.g. something.html:

customerA 1
customerB 1

If I query on requestUrl it looks like it is using the whole field, not the split out parts.

I could work around this if there is a way of adding a field at ingest time (these come from Filebeat into Elastic Cloud) based on a regex result, as the customer-identifying part is always matchable, in this example customer(.*)
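
A minimal sketch of that extraction, run locally in Python rather than in Filebeat or Elasticsearch. The `customer` field name, the helper names, and the exact pattern (customer part as the first path segment) are assumptions for illustration only:

```python
import re
from collections import Counter

# Sample URLs from the post above.
URLS = [
    "/customerA/whatever.html",
    "/customerA/something.html",
    "/customerA/whatever.html",
    "/customerB/something.html",
    "/customerC/whatever.html",
]

# Assumed pattern: the customer part is the first path segment.
CUSTOMER_RE = re.compile(r"^/(customer[^/]+)/")


def add_customer_field(url):
    """Return a document with a hypothetical extra `customer` field."""
    match = CUSTOMER_RE.match(url)
    return {"requestUrl": url, "customer": match.group(1) if match else None}


def customers_for_page(urls, page):
    """Count documents per customer whose URL ends with the given page."""
    docs = [add_customer_field(u) for u in urls]
    return Counter(
        d["customer"]
        for d in docs
        if d["customer"] and d["requestUrl"].endswith("/" + page)
    )


print(customers_for_page(URLS, "whatever.html"))
```

With a separate field like this, the chart becomes a plain count on `customer` filtered by page, which is the result described above.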

@Cylindric would you mind sharing the mappings for the fields you're building the reports on?

Sure. Currently I have the following, but I can easily change it and recreate the indices if necessary:

    "requestUrl": {
      "index": "analyzed",
      "type": "string",
      "fielddata": true,
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    },

Adding a dedicated field at ingest time would absolutely be the ideal way to handle this, but you can also use the advanced settings in the terms aggregation to specify a regular expression that terms must match. For example, choose your field and set the include pattern to customer.*
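
As a rough sketch of the same idea outside the visualisation editor, the request body for an equivalent search could be built like this. The helper name and the `customers` aggregation name are made up for illustration, and the `match` filter on the page token is an assumption about how the "other token is present" condition would be expressed. The standard analyzer lowercases tokens, so the include pattern is kept lowercase:

```python
import json


def customer_terms_body(page_token, field="requestUrl",
                        pattern="customer.*", size=10):
    """Build a search body: filter by page token, bucket by customer terms."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"match": {field: page_token}},
        "aggs": {
            "customers": {
                "terms": {
                    "field": field,
                    "include": pattern,  # regex that each term must match
                    "size": size,
                }
            }
        },
    }


print(json.dumps(customer_terms_body("whatever.html"), indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index holding the logs.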
