Hi everyone,
We are running Elasticsearch 6.3.0 with Kibana and Fluentd, and we are trying to extract some information from a field called log, which is mapped as follows:
"log": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
The log entries have the following format:
2019-02-19 23:20:36.633 [ipAddress:10.0.0.1 | method:GET | requestURI:/uri | userId:anonymousUser | requestId:1235abcd] INFO 1 --- [testing] LoggingFilter : Request duration (milliseconds): 470
Using Painless, we have been trying to create a scripted field that extracts the request duration (470 in the example above), but we have not been able to come up with a solution so far.
After reading the docs, we tried accessing doc['log.keyword'].value, but surprisingly this value always seems to be null. We then switched to params._source.log, and it kind of works, in the sense that the log field is available in this variable, but when we try to match against it we get inconsistent results. Here's a test Painless script I wrote, because I was always getting 'No match' when using matcher:
def time = /^.*\(milliseconds\): ([0-9]+)$/.matcher(params._source.log);
def time2 = params._source.log =~ /^.*\(milliseconds\): ([0-9]+)$/;
return time.matches() + " vs " + time2;
// This returns false vs true for the same regexp
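The two results are not necessarily contradictory. In Painless, =~ is the find operator (true if any part of the input matches), whereas matcher.matches() requires the entire input to match. A minimal sketch of that difference in plain Java, whose java.util.regex classes Painless regexes are built on. The class name, the shortened sample line and the trailing newline are assumptions made for illustration; nothing in the post confirms that the stored log value actually ends with a newline.

```java
import java.util.regex.Pattern;

class FindVsMatches {
    // Same pattern as in the Painless test script.
    static final Pattern DURATION =
            Pattern.compile("^.*\\(milliseconds\\): ([0-9]+)$");

    // Counterpart of `value =~ /regex/`: true if some part of the input matches.
    static boolean find(String value) {
        return DURATION.matcher(value).find();
    }

    // Counterpart of matcher.matches(): true only if the whole input matches.
    static boolean matches(String value) {
        return DURATION.matcher(value).matches();
    }

    public static void main(String[] args) {
        String line = "LoggingFilter : Request duration (milliseconds): 470";
        // Hypothetical: the same line as it might be stored with a trailing newline.
        String withNewline = line + "\n";

        System.out.println(matches(line) + " vs " + find(line));               // true vs true
        System.out.println(matches(withNewline) + " vs " + find(withNewline)); // false vs true
    }
}
```

Without the DOTALL flag, `.` does not match a line terminator, and `$` is satisfied just before a final newline, so find() succeeds while matches() fails because the newline itself is left unmatched.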
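For the sake of testing, I tried passing the whole string as a literal instead of reading it from params._source.log (the pattern here also has a trailing .* after the capture group), and interestingly, this time it returns "true vs true":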
def time = /^.*\(milliseconds\): ([0-9]+).*$/.matcher('2019-02-19 23:20:36.633 [ipAddress:10.0.0.1 | method:GET | requestURI:/uri | userId:anonymousUser | requestId:1235abcd] INFO 1 --- [testing] LoggingFilter : Request duration (milliseconds): 470');
def time2 = '2019-02-19 23:20:36.633 [ipAddress:10.0.0.1 | method:GET | requestURI:/uri | userId:anonymousUser | requestId:1235abcd] INFO 1 --- [testing] LoggingFilter : Request duration (milliseconds): 470' =~ /^.*\(milliseconds\): ([0-9]+).*$/;
return time.matches() + " vs " + time2;
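Since the literal string matches in full but the stored field does not, one possible explanation is that the value in params._source.log contains characters the literal lacks, such as a trailing newline. This is an assumption, not something confirmed here. Under that assumption, a more tolerant approach is to search with find() using an unanchored pattern and read the capture group. A minimal sketch in plain Java (Painless regexes are built on java.util.regex); the class and method names are hypothetical:

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DurationExtractor {
    // Unanchored on purpose: find() looks for this fragment anywhere in the input,
    // so leading or trailing characters (including newlines) do not matter.
    static final Pattern DURATION =
            Pattern.compile("\\(milliseconds\\): ([0-9]+)");

    // Returns the request duration in milliseconds, or empty if none is present.
    static OptionalLong extract(String log) {
        if (log == null) {
            return OptionalLong.empty();
        }
        Matcher m = DURATION.matcher(log);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Hypothetical stored value: the sample entry plus a trailing newline.
        String stored = "2019-02-19 23:20:36.633 [ipAddress:10.0.0.1 | method:GET | "
                + "requestURI:/uri | userId:anonymousUser | requestId:1235abcd] INFO 1 --- "
                + "[testing] LoggingFilter : Request duration (milliseconds): 470\n";
        System.out.println(extract(stored)); // OptionalLong[470]
    }
}
```

Painless's Matcher offers the same find() and group(int) methods, so the idea should carry over to a scripted field that reads params._source.log, although that has not been verified against 6.3.0.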
Any advice on how to do this? Is the field type a problem for what we are trying to do? Why is doc['log.keyword'] always empty? Is params._source the right way of accessing the log field?
Thanks!