Group by RegExp

Hi,
I need to run a query and group the results according to a regular expression.
For example, I have a field with these possible values:
/sdc/user?id=4039&dc=4
/sdc/user?id=4039&dc=2
/sdc/user?id=2222&dc=7

should give me:
/sdc/user?id=4039 2
/sdc/user?id=2222 1

Is this possible?

Thanks in advance

You could do it with a script/value_script on the terms aggregation, and use the regex functionality that Groovy scripting provides. It won't be very efficient, since scripting is a fair amount slower than aggregating on an indexed field, but it'll work.
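
To make the grouping itself concrete, here is a minimal sketch in plain Python, run outside Elasticsearch, of what such a script would compute for each value. The pattern and the helper names (`ID_PREFIX`, `group_key`, `count_by_key`) are assumptions for illustration, not part of any Elasticsearch or Groovy API.

```python
import re
from collections import Counter

# Hypothetical pattern: keep everything up to and including the id parameter.
ID_PREFIX = re.compile(r"^(.*?[?&]id=\d+)")


def group_key(url):
    """Return the grouping key for a URL, or None if it has no id parameter."""
    match = ID_PREFIX.match(url)
    return match.group(1) if match else None


def count_by_key(urls):
    """Count how many URLs share each grouping key, skipping URLs without an id."""
    return Counter(key for key in map(group_key, urls) if key is not None)


urls = [
    "/sdc/user?id=4039&dc=4",
    "/sdc/user?id=4039&dc=2",
    "/sdc/user?id=2222&dc=7",
]
counts = count_by_key(urls)
for key, count in counts.most_common():
    print(key, count)
```

A script on the terms aggregation would apply the equivalent of `group_key` to every document at query time, which is where the extra cost comes from.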

A better approach is to extract some of that structure ahead of time. Either use an analyzer that breaks those strings into smaller components (then run a terms aggregation to count up the number of ?id=<num> tokens), or extract id, dc, and the other query params into their own fields, which would allow you to run a terms agg on those fields directly.
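
As a rough sketch of the second option, assuming the extraction happens in your own code before indexing: the snippet below splits each URL into separate fields with Python's standard `urllib.parse`. The helper name `extract_fields` and the field names `url`, `path`, `id`, and `dc` are assumptions; adjust them to your mapping.

```python
from urllib.parse import urlsplit, parse_qs


def extract_fields(url):
    """Split a URL into a document with the path and each query param as its own field."""
    parts = urlsplit(url)
    doc = {"url": url, "path": parts.path}
    # parse_qs returns a list per parameter; keep the first value for simplicity.
    for name, values in parse_qs(parts.query).items():
        doc[name] = values[0]
    return doc


doc = extract_fields("/sdc/user?id=4039&dc=4")
print(doc)

# Request body for a terms aggregation on the extracted field.
# "by_id" is an arbitrary aggregation name; "id" assumes the field name used above.
terms_request = {
    "size": 0,
    "aggs": {"by_id": {"terms": {"field": "id"}}},
}
```

With the id in its own field, the aggregation no longer needs a script at query time.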

Thanks for the suggestions,
I think the best way would be to extract the query params into fields and then run a terms agg on them.
I guess that should be the most efficient.

++ if you can extract them ahead of time, it'd definitely be a lot more efficient. And once they're extracted, you can use them for all kinds of unrelated analysis (top IDs, IDs over time, etc) :)