Painless scripted fields with regex


(Tom M) #1

Hi,

ES/Kibana version: 6.2.1 + fluentd

I'm trying to work out why a simple regex does not work.
I have enabled script.painless.regex.enabled in elasticsearch.yml
I've followed https://www.elastic.co/blog/using-painless-kibana-scripted-fields on how to match a string and return that match.

My document field "message", of type text, has fielddata: true set in the mapping for the type "fluentd". For some reason I couldn't add it to the mapping for the type "doc":

"type" : "illegal_argument_exception",
"reason" : "Rejecting mapping update to [logstash-2018.02.26] as the final mapping would have more than 1 type: [doc, fluentd]"

My messages contain access logs for my API requests and have lots of GET/POST/PUT entries, which I can search without issues using a query in Kibana.

This is the script:

def m = /(GET|POST|PUT)/.matcher(fluentd['message'].value);
if ( m.matches() ) {
   return m.group(1)
} else {
   return "no match"
}

This returns 100% "no match".

Am I missing something? Is it to do with the mixture of mappings doc/fluentd?

Any tips appreciated.

Thanks!


(Ryan Ernst) #2

m.matches() will try to match the entire input to your pattern. Do you only mean to search for it? If so, use m.find(). See the docs for Matcher.
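To make the difference concrete, here is a minimal plain-Java sketch (Painless regexes are java.util.regex under the hood). The class name, method names and the sample log line are made up for illustration; only the pattern comes from the script above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Same pattern as in the script from the first post.
    static final Pattern VERB = Pattern.compile("(GET|POST|PUT)");

    // matches(): true only if the WHOLE input is exactly GET, POST or PUT.
    static boolean wholeInputIsVerb(String input) {
        return VERB.matcher(input).matches();
    }

    // find(): looks for the pattern anywhere in the input and returns the first hit.
    static String firstVerb(String input) {
        Matcher m = VERB.matcher(input);
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        // Hypothetical access-log line, not taken from the original poster's data.
        String line = "127.0.0.1 \"GET /api/v1/users HTTP/1.1\" 200";
        System.out.println(wholeInputIsVerb(line));  // false
        System.out.println(wholeInputIsVerb("GET")); // true
        System.out.println(firstVerb(line));         // GET
    }
}
```

Note that group(1) is available after a successful find() just as it is after matches(), so switching to find() still lets you return the matched text.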


(Tom M) #3

Yes, I mean to search, but I also want to return the match, because down the track I want to be able to find and return the most common API v1 routes used in all requests. So I believe m.matches() is what I need to search each message, no? Anyhow, I don't think that is the main issue, as it still won't match anything.


(Ryan Ernst) #4

m.matches() will return false if the entirety of fluentd['message'].value is not one of GET, POST or PUT.

Sorry, I did not notice the first exception you mentioned before. You should read about the removal of types. Starting with Elasticsearch 6.0, you must have only a single type in your index. I don't know anything about fluentd, but you will need to change how documents from it are added to Elasticsearch so that they use the same type as the other documents.


(Tom M) #5

What I cannot understand is that even though all my messages in the index have _type=fluentd, the script fails when using:

fluentd['message'].value

which returns:

{"type":"illegal_argument_exception","reason":"Variable [fluentd] is not defined."}}}]},"status":500}


(Tom M) #6

I've tried the script on a simpler field such as doc['hostname'], and I just don't understand why the ES regex doesn't behave like a normal regex. Online regex testers match, but ES does not; it returns strange things... I think I'm going to give up on ELK due to its complexity!


(Ryan Ernst) #7

Sorry, I copy/pasted that without reading it. Indeed, fluentd will never be defined. The variables that are available inside scripts are defined based on the context within Elasticsearch they are used from. So doc is a map that gives access to values from the document that is currently being scored within a scoring script. ES regexes are not special in any way; they are regexes from Java. See https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html.


(Tom M) #8

I worked out what I was doing wrong: I wasn't using the keyword sub-field, so my script had to read the value like this:

doc['message.keyword'].value

It would be good to see the Painless scripting documentation updated with more details and a mention of the keyword sub-field.
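As far as I understand it (this is not confirmed anywhere in this thread), a text field with fielddata enabled exposes its analyzed terms to scripts, and the default analyzer lowercases them, whereas the keyword sub-field holds the original string as one value. The plain-Java sketch below only imitates that difference: roughTokens is a crude, hypothetical stand-in for analysis (lowercase, split on non-alphanumerics, sorted), not Elasticsearch's real analyzer, and the sample message is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class KeywordVsText {
    static final Pattern VERB = Pattern.compile("(GET|POST|PUT)");

    // ASSUMPTION: rough imitation of an analyzed text field, for illustration only.
    static List<String> roughTokens(String message) {
        return Arrays.stream(message.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(t -> !t.isEmpty())
                .sorted()
                .distinct()
                .collect(Collectors.toList());
    }

    // Search anywhere in the value and return the first verb found.
    static String firstVerb(String value) {
        Matcher m = VERB.matcher(value);
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        String message = "GET /api/v1/users 200"; // hypothetical log message

        // Roughly what a script might see from the analyzed field: a single lowercase term.
        String analyzedValue = roughTokens(message).get(0);
        System.out.println(firstVerb(analyzedValue)); // no match

        // What the keyword sub-field would give: the untouched original string.
        System.out.println(firstVerb(message)); // GET
    }
}
```

If that understanding is right, the uppercase pattern can never match an individual lowercase term, which would explain the 100% "no match" result before switching to the keyword sub-field.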

Cheers,

