I'm trying to work out why a simple regex does not work.
I have enabled script.painless.regex.enabled in elasticsearch.yml.
I've followed "Using Painless in Kibana scripted fields | Elastic Blog" on how to match a string and return that match.
My document field "message", of type text, has fielddata: true set in the mapping for the type "fluentd". For some reason I couldn't add it to the mapping for the type "doc":
"type" : "illegal_argument_exception",
"reason" : "Rejecting mapping update to [logstash-2018.02.26] as the final mapping would have more than 1 type: [doc, fluentd]"
My messages contain access logs for my API requests and have lots of GET/POST/PUT entries, which I can search without issues using a query in Kibana.
This is the script:
def m = /(GET|POST|PUT)/.matcher(fluentd['message'].value);
if ( m.matches() ) {
return m.group(1)
} else {
return "no match"
}
This returns 100% "no match".
Am I missing something? Is it to do with the mixture of mappings doc/fluentd?
Yes, I mean to search but also return the match, as down the track I want to be able to find and return the most common API v1 routes used in all requests. So I believe m.matches() is what I need to search each message, no? Anyhow, I don't believe that is the main issue, as it still won't match anything.
m.matches() will return false if the entirety of fluentd['message'].value is not one of GET, POST or PUT.
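Since Painless regexes are plain Java regexes, the difference can be sketched in ordinary Java. This is only an illustration, not the scripted field itself: the class name, the helper names and the sample log line are made up for the example. It contrasts matches(), which requires the whole input to match, with find(), which looks for the pattern anywhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchVsFind {
    // Same pattern as in the script above.
    static final Pattern METHOD = Pattern.compile("(GET|POST|PUT)");

    // matches(): succeeds only if the ENTIRE input is GET, POST or PUT.
    static String viaMatches(String message) {
        Matcher m = METHOD.matcher(message);
        return m.matches() ? m.group(1) : "no match";
    }

    // find(): succeeds if the pattern occurs anywhere in the input.
    static String viaFind(String message) {
        Matcher m = METHOD.matcher(message);
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        // Hypothetical access-log line, assumed for illustration only.
        String line = "GET /api/v1/users HTTP/1.1";
        System.out.println(viaMatches(line));  // prints "no match"
        System.out.println(viaFind(line));     // prints "GET"
        System.out.println(viaMatches("GET")); // prints "GET"
    }
}
```

If the goal is to pull the method out of a longer message, find() (or a pattern that covers the whole line, such as one wrapped in .* on both sides) is probably closer to what is wanted than matches().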
Sorry, I did not notice the first exception you mentioned before. You should read about the removal of types. Starting with Elasticsearch 6.0 you must have only a single type in your index. I don't know anything about fluentd, but you will need to change how documents from it are added to Elasticsearch so that they use the same type as other documents.
I've tried the script on a simpler field such as doc['hostname'] and I just don't understand why the ES regex doesn't behave like a normal regex. Online regex testers match, but ES doesn't; it returns strange things... I think I'm going to give up on ELK due to its complexity!
Sorry, I copy/pasted that without reading it. Indeed, fluentd will never be defined. The variables that are available inside scripts are defined based on the context within Elasticsearch they are used from. So doc is a map that gives access to values from the document that is currently being scored within a scoring script. ES regexes are not special in any way; they are regexes from Java. See https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html.