Hi,
I need to run a query and group the results according to a regular expression.
For example, I have a field with these possible values:
/sdc/user?id=4039&dc=4
/sdc/user?id=4039&dc=2
/sdc/user?id=2222&dc=7
should give me:
/sdc/user?id=4039 2
/sdc/user?id=2222 1
You could do it with a script/value_script on the terms aggregation, and use the regex functionality that Groovy scripting provides. It won't be super efficient... scripting is a fair amount slower, but it'll work.
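To make the intended grouping concrete, here is a rough sketch of the same regex logic done client-side in Python. This is not Elasticsearch script code, only an illustration of what the script would need to compute per value. The pattern and the name `group_key` are assumptions based on the example values in the question.

```python
import re
from collections import Counter

# Assumed pattern: keep everything up to and including the id parameter.
ID_PREFIX = re.compile(r"^(.*?\?id=\d+)")

def group_key(url):
    """Return the URL truncated after the id parameter, or the URL unchanged if it has none."""
    match = ID_PREFIX.match(url)
    return match.group(1) if match else url

urls = [
    "/sdc/user?id=4039&dc=4",
    "/sdc/user?id=4039&dc=2",
    "/sdc/user?id=2222&dc=7",
]

# Count how many values fall into each group.
counts = Counter(group_key(u) for u in urls)
for key, count in counts.most_common():
    print(key, count)
# prints:
# /sdc/user?id=4039 2
# /sdc/user?id=2222 1
```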
A better approach is to extract some of that structure ahead of time. Either use an analyzer that breaks those strings into smaller components (then run a terms aggregation to count up the ?id=<num> tokens), or extract id, dc, and any other query params into their own fields at index time, which would let you run a terms agg on those fields directly.
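A minimal sketch of the second option, assuming the documents are prepared in Python before indexing. The helper name `extract_fields` and the field names `url`, `path`, `id` and `dc` are hypothetical; adapt them to your mapping. The request body at the end is a plain terms aggregation on the extracted `id` field.

```python
from urllib.parse import urlsplit, parse_qs

def extract_fields(url):
    """Split a URL into its path and one field per query parameter."""
    parts = urlsplit(url)
    doc = {"url": url, "path": parts.path}
    # parse_qs returns a list per parameter; keep the first value only.
    for name, values in parse_qs(parts.query).items():
        doc[name] = values[0]
    return doc

# Document to index instead of (or alongside) the raw string.
doc = extract_fields("/sdc/user?id=4039&dc=4")
print(doc)

# Search body: count documents per extracted id (field name is an assumption).
terms_query = {
    "size": 0,
    "aggs": {
        "ids": {
            "terms": {"field": "id"}
        }
    }
}
```

A query parameter that happens to be called `url` or `path` would overwrite those keys in this sketch, so a prefix on the extracted names may be worth adding.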
Thanks for the suggestions,
I think the best way would be to extract the query params into fields and then run a terms agg on them.
I guess that should be the most efficient.
++ if you can extract them ahead of time, it'd definitely be a lot more efficient. And once they're extracted, you can use them for all kinds of unrelated analysis (top IDs, IDs over time, etc.)