Hi
I am completely new to ELK and am just using it for log analysis. I need to create a graph that shows the top 10 slowest web sites. I know how to build a visualization by defining the x and y axes, and I know about split lines. The problem I am facing is that my request URLs look like this:
http://mywebsite/somehtmlpage?q=asdlfkajs;lkdfasff
I need to aggregate the average response time by "somehtmlpage", but the query parameters seem to be interfering.
Is there a way to ignore the query parameters and match only the HTML pages at search time, or do I have to make some changes to Logstash to filter out the query parameters?
You can do this with a Painless scripted field, but scripted fields have performance limitations (which you can read about here: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-painless.html#modules-scripting-painless-regex), so I strongly recommend you update your Logstash configuration instead.
However, if you want to go the scripted field route, I got it to work using a regex:
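// Only run the regex when the field has a value.
if (doc['url.keyword'].value !== null) {
    // Capture the URL characters that precede the '?', dropping the query string.
    Matcher m = /([a-zA-Z0-9_:.\/]*)\?/.matcher(doc['url.keyword'].value);
    if (m.find()) {
        return m.group(1);
    }
}
// No query string found: return the URL unchanged.
return doc['url.keyword'].value;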
You'll probably have to enable regex in your elasticsearch.yml as well:
script.painless.regex.enabled: true
But again, go the Logstash route if that is an option!
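For reference, here is a minimal sketch of what the Logstash route could look like. It assumes the event field is named url (as in the scripted field above) and writes the stripped value to a new field called url_path, which is just a name I made up for illustration. Adjust both to match your pipeline.

```
filter {
  # Copy the original URL so the full value (with query string) is preserved.
  # "url" is assumed from the scripted field above; "url_path" is a hypothetical name.
  mutate {
    copy => { "url" => "url_path" }
  }
  # Separate mutate block, because copy is applied after gsub within a single block.
  # Strip everything from the first "?" to the end of the string.
  mutate {
    gsub => [ "url_path", "[?].*$", "" ]
  }
}
```

With something like this in place, you should be able to use the new field (or its keyword sub-field, depending on your mapping) in a terms aggregation and take the average response time per page, with no scripting at search time. Note that this only affects newly indexed documents.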