Painless scripted fields with regex

bluemalkin · February 26, 2018, 5:10am

Hi,

ES/Kibana version: 6.2.1 + fluentd

I'm trying to work out why a simple regex does not work.
I have enabled script.painless.regex.enabled in elasticsearch.yml.
I've followed the Elastic blog post "Using Painless in Kibana scripted fields" on how to match a string and return that match.

My document field "message", of type text, has fielddata: true set under the mapping type "fluentd". For some reason I couldn't add it to the mapping type "doc".

"type" : "illegal_argument_exception",
"reason" : "Rejecting mapping update to [logstash-2018.02.26] as the final mapping would have more than 1 type: [doc, fluentd]"

My messages contain access logs for my API requests and include lots of GET/POST/PUT entries, which I can search without issues using a query in Kibana.

This is the script:

def m = /(GET|POST|PUT)/.matcher(fluentd['message'].value);
if ( m.matches() ) {
   return m.group(1)
} else {
   return "no match"
}

This returns 100% "no match".

Am I missing something? Is it to do with the mixture of mapping types doc/fluentd?

Any tips appreciated.

Thanks!

rjernst · February 26, 2018, 5:44am

m.matches() will try to match the entire input to your pattern. Do you only mean to search for it? If so, use m.find(). See the docs for Matcher.
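To make the difference concrete, here is a minimal plain-Java sketch (Painless regexes are Java regexes, so the behaviour should carry over). The class name, method names and the sample log line are made up for illustration; only the pattern comes from the script above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchVsFind {
    // Same pattern as the scripted field in the question.
    static final Pattern HTTP_METHOD = Pattern.compile("(GET|POST|PUT)");

    // Mirrors the original script: matches() succeeds only if the
    // whole input is matched by the pattern.
    static String withMatches(String message) {
        Matcher m = HTTP_METHOD.matcher(message);
        return m.matches() ? m.group(1) : "no match";
    }

    // find() scans the input for the first occurrence of the pattern.
    static String withFind(String message) {
        Matcher m = HTTP_METHOD.matcher(message);
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        // Hypothetical access-log line, not taken from the original index.
        String line = "10.0.0.1 - - \"GET /api/v1/users HTTP/1.1\" 200";
        System.out.println(withMatches(line));  // whole line is not just "GET"
        System.out.println(withFind(line));     // finds "GET" inside the line
        System.out.println(withMatches("GET")); // input is exactly "GET"
    }
}
```

In the scripted field, the equivalent change would be swapping m.matches() for m.find(); m.group(1) can still be used to return the matched text afterwards.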

bluemalkin · February 26, 2018, 5:49am

Yes, I mean to search, but also to return the match, as down the track I want to be able to find and return the most common API v1 routes used in all requests. So I believe m.matches() is what I need to search on each message, no? Anyhow, I don't believe that is the main issue, as it still won't match anything.

rjernst · February 26, 2018, 6:42am

m.matches() will return false if the entirety of fluentd['message'].value is not one of GET, POST or PUT.

Sorry, I did not notice the first exception you mentioned before. You should read about the removal of types. Starting with Elasticsearch 6.0 you must have only a single type in your index. I don't know anything about fluentd, but you will need to change how documents from it are added to Elasticsearch so that they use the same type as other documents.

bluemalkin · February 26, 2018, 11:05pm

What I cannot understand is that even though all my messages in the index have _type=fluentd, the script fails when using:

fluentd['message'].value

which returns:

{"type":"illegal_argument_exception","reason":"Variable [fluentd] is not defined."}}}]},"status":500}

bluemalkin · February 27, 2018, 12:01am

I've tried the script on a simpler field such as doc['hostname'] and I just don't understand why the ES regex doesn't behave like a normal regex. Online regex testers match, but ES does not; it returns strange things... I think I'm going to give up on ELK due to its complexity!

rjernst · February 27, 2018, 8:50pm

Sorry, I copy/pasted that without reading it. Indeed, fluentd will never be defined. The variables that are available inside scripts are defined based on the context within Elasticsearch they are used from. So doc is a map that gives access to values from the document that is currently being scored within a scoring script. ES regexes are not special in any way; they are regexes from Java. See https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html.

bluemalkin · March 2, 2018, 2:32am

I worked out what I was doing wrong: I wasn't using the keyword field, so my script had to read the value like this:

doc['message.keyword'].value

It would be good to see the Painless scripting documentation updated with more details, including a mention of the keyword field.

Cheers,
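A likely reason the keyword field helps, stated here as an assumption rather than something confirmed in this thread: with fielddata enabled on an analyzed text field, doc['message'].value exposes individual analyzed terms (lowercased by the default analyzer) instead of the original string, whereas message.keyword holds the untouched value. The plain-Java sketch below only imitates that with a crude tokenizer; roughTokens and the sample line are hypothetical and are not how Elasticsearch analyzes text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TextVsKeyword {
    // Case-sensitive pattern from the original scripted field.
    static final Pattern HTTP_METHOD = Pattern.compile("(GET|POST|PUT)");

    // Crude stand-in for an analyzed text field (an assumption):
    // split on non-alphanumeric characters and lowercase each piece.
    static List<String> roughTokens(String message) {
        return Arrays.stream(message.split("[^A-Za-z0-9]+"))
                .filter(t -> !t.isEmpty())
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Returns the first HTTP method found in the value, or "no match".
    static String firstMethod(String value) {
        Matcher m = HTTP_METHOD.matcher(value);
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        String raw = "GET /api/v1/users 200"; // hypothetical message
        // Lowercased terms never match the uppercase pattern.
        for (String term : roughTokens(raw)) {
            System.out.println(term + " -> " + firstMethod(term));
        }
        // The untouched string, as a keyword field would hold it, does match.
        System.out.println(firstMethod(raw));
    }
}
```

If that assumption holds, it would also explain why the same pattern behaved as expected in online regex testers, which were given the full original string.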

system · March 30, 2018, 2:32am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
