Regex not working with painless

I'm trying to write a regex query for a Java log error like the following:

[Poolthread] com.xxxx.content.core-bundle com.xxxxx.content.model.impl.RegisterTypeInternal(3179)] The activate method has thrown an exception (com.xxxxx.content.model.exception.ModelException: ModelException: {Code}-LCC-REP-FCT-002, {Message}-Access denied)
com.xxxx.content.model.exception.ModelException: ModelException: {Code}-LCC-REP-FCT-002, {Message}-Access denied
	at com.xxxxx.content.repository.utils.ExceptionUtil.getException(ExceptionUtil.java:52)
	at com.xxxxx.content.repository.utils.ExceptionUtil.getException(ExceptionUtil.java:171)

I'm using this regex, which captures the exception name before the ':' character (e.g. ModelException), but the result is null. I tested the regex on regex101.com and it works fine.

([a-zA-Z0-9_]+)(?=:)

I use it in a simple script field / runtime field that returns the matched group:

def m = /([a-zA-Z0-9_]+)(?=:)/.matcher(doc['field.keyword'].value);
if ( m.find() ) {
   return m.group(0)
} else {
   return "no find"
}

I have no idea whether I am missing something in the syntax.
I am also looking for a regex that can extract information from the Java log error, such as the class name, URL, file, exception, level, etc.

Thank you for your help

Hi @frost,
Can you post your mappings? You may have something like:

    "mappings" : {
      "properties" : {
        "field" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }

If so, the above entry would not be available in field.keyword because it's 561 characters long: values longer than ignore_above are not indexed in the keyword sub-field. It's usually not a good idea to raise that limit above 256 characters, as it bloats the index.

However, you can use params['_source']['field'] to access the value from source. This works with both runtime fields and script fields.

def m = /([a-zA-Z0-9_]+)(?=:)/.matcher(params['_source']['field']);
if ( m.find() ) {
   return m.group(0) // emit(m.group(0)) for runtime fields
} else {
   return "no find"
}
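
To picture the ignore_above behaviour described above, here is a minimal sketch in plain Java. It is only a simplified model, not Elasticsearch code: the class and method names are made up for illustration, and the 561-character string is a stand-in for the real log entry.

```java
import java.util.Optional;

class IgnoreAboveSketch {
    // Hypothetical model of a keyword sub-field: a value longer than
    // ignoreAbove is skipped, so a script reading the field sees nothing.
    static Optional<String> keywordValue(String value, int ignoreAbove) {
        return value.length() > ignoreAbove ? Optional.empty() : Optional.of(value);
    }

    public static void main(String[] args) {
        String shortValue = "Access denied";
        String longValue = "x".repeat(561); // stand-in for the 561-character log entry

        System.out.println(keywordValue(shortValue, 256).isPresent()); // true
        System.out.println(keywordValue(longValue, 256).isPresent());  // false
    }
}
```

The second case is the situation in this thread: the keyword sub-field is empty for the long message, while the original text is still available in _source.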

Welcome to our community! :smiley:

As an alternative approach, see if you can set up an ingest pipeline to handle the extraction of the values during the indexing process. It'll make querying a tonne easier.
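
To give an idea of what such an extraction would pull out, here is a rough sketch in plain Java rather than an actual pipeline definition. The two patterns and all names (LogFields, EXCEPTION_LINE, STACK_FRAME, parse) are assumptions based only on the sample log in the question, not a tested grok pattern; a real pipeline would express something similar with a grok or dissect processor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogFields {
    // "com.x.ModelException: some message" -> simple exception name and message
    private static final Pattern EXCEPTION_LINE = Pattern.compile(
        "^(?<fqcn>(?:[\\w$]+\\.)*(?<exception>[\\w$]+)):\\s*(?<message>.*)$");

    // "\tat com.x.ExceptionUtil.getException(ExceptionUtil.java:52)"
    private static final Pattern STACK_FRAME = Pattern.compile(
        "^\\s*at\\s+(?<classname>[\\w.$]+)\\.(?<method>[\\w$<>]+)"
        + "\\((?<file>[^:()]+):(?<line>\\d+)\\)$");

    // Returns the named fields for one log line, or an empty map if
    // neither pattern matches the whole line.
    static Map<String, String> parse(String logLine) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher frame = STACK_FRAME.matcher(logLine);
        if (frame.matches()) {
            fields.put("classname", frame.group("classname"));
            fields.put("method", frame.group("method"));
            fields.put("file", frame.group("file"));
            fields.put("line", frame.group("line"));
            return fields;
        }
        Matcher exc = EXCEPTION_LINE.matcher(logLine);
        if (exc.matches()) {
            fields.put("exception", exc.group("exception"));
            fields.put("message", exc.group("message"));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("com.xxxx.content.model.exception.ModelException: "
            + "ModelException: {Code}-LCC-REP-FCT-002, {Message}-Access denied"));
        System.out.println(parse("\tat com.xxxxx.content.repository.utils."
            + "ExceptionUtil.getException(ExceptionUtil.java:52)"));
    }
}
```

Each line of the multi-line log entry is parsed on its own here, so the message would have to be split on newlines first.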


Thanks for your response,
I apologize for my late reply. I don't have administrator rights to manage the mapping.
I tried params['_source']['field'], but it doesn't seem to work.
It seems to me that there is a 256-character limit. Indeed, I used another regex to extract the first word of a message and the result shows up fine.

So the data mapping will be changed; I will come back and let you know whether it works.

@stu
here is the mapping

   ....
     "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 65536
            }
          }
        }
   .....

What do you think?

Hi @forst,
At this point I'm assuming there's something wrong elsewhere in your setup and there isn't enough info in this thread to diagnose where the problem lies.

Your script is fine: it worked when I tested it with the given data and correct mappings, and it also worked when I tested it with params['_source']['field'].
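
For anyone who wants to reproduce that check outside Elasticsearch, here is a minimal sketch in plain Java. Painless regex literals are backed by java.util.regex, so the pattern should behave the same way there; the class and method names are made up for illustration, and the sample string is the first line of the log entry from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCheck {
    // Same pattern as in the script: a run of word characters
    // that is directly followed by ':' (the ':' itself is not consumed).
    private static final Pattern WORD_BEFORE_COLON =
        Pattern.compile("([a-zA-Z0-9_]+)(?=:)");

    // Hypothetical helper mirroring the script: first match, or "no find".
    static String firstWordBeforeColon(String message) {
        Matcher m = WORD_BEFORE_COLON.matcher(message);
        return m.find() ? m.group(0) : "no find";
    }

    public static void main(String[] args) {
        String line = "[Poolthread] com.xxxx.content.core-bundle "
            + "com.xxxxx.content.model.impl.RegisterTypeInternal(3179)] "
            + "The activate method has thrown an exception "
            + "(com.xxxxx.content.model.exception.ModelException: "
            + "ModelException: {Code}-LCC-REP-FCT-002, {Message}-Access denied)";
        System.out.println(firstWordBeforeColon(line)); // prints "ModelException"
    }
}
```

Since the pattern matches here, a null or "no find" result in Elasticsearch points at the input the script receives rather than at the regex.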

I tried params['_source']['field'], but it doesn't seem to work.

What happens when you return that value rather than trying to run a regex against it? Please post the precise script, document and result.

What do you think?

Was that the original or the updated mapping? If it's the updated one, did you reindex the documents? An updated mapping only applies to documents indexed after the change takes effect.

Hi @stu

Here is the result when I run the script.

The mapping is the original one.

What happens when you return params['_source']['field']? What happens when you Debug.explain(params['_source'])?

Sorry, I'm new to Kibana. How do I use this command?

Debug.explain(params['_source'])

(With F12?)
Normally the data should show up under 'first 10 results'.
In Visualize, no data shows up.