_grokparsefailure tag


#1

Hello,
I am trying to parse the message field of some logs and split it into separate fields.
I am using some custom-made grok patterns that should apply to all of the messages (tested in the grok debugger). Nevertheless, only some messages are parsed and the rest get the _grokparsefailure tag. In addition, the messages that do get parsed are usually the same kind of message.
The obvious explanation would be that the grok patterns do not match the message, but again, I have tested them at grokdebug.herokuapp.com and the results are as expected. Also, for the messages where the parsing works, the new fields do appear in the available fields section in the column on the left, but the message is not replaced with the new fields in the source section.

Sample of the logs:
Parsed logs:

2019-02-12 10:28:10,155 ERROR [HiveServer2-Background-Pool: Thread-76246]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(193)) - AlreadyExistsException(message:Database continuous_deployment already exists)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:979)
	at sun.reflect.GeneratedMethodAccessor74.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
	at com.sun.proxy.$Proxy20.create_database(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:707)
	at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
	at com.sun.proxy.$Proxy21.createDatabase(Unknown Source)
	at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:376)
	at org.apache.hadoop.hive.ql.exec.DDLTask.createDatabase(DDLTask.java:3934)
	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:276)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2052)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1748)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
	at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
	at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Not parsed logs:

2019-02-12 10:29:05,269 INFO [HiveServer2-Background-Pool: Thread-71289]: ql.Driver (SessionState.java:printInfo(1080)) - OK

2019-02-12 10:29:05,073 INFO [HiveServer2-Handler-Pool: Thread-89]: session.SessionState (SessionState.java:createPath(696)) - Created local directory: /tmp/hive/de49ae1c-8cb7-486b-b56d-f957735a07b9

Logstash conf file:

input {
    beats {
        port => 5045
        client_inactivity_timeout => 0
    }
}
filter {
    grok {
        patterns_dir => ["/etc/logstash/patterns"]
        match => [ "message" , "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}"]
        overwrite => [ "message" ]
    }
}
output {
    elasticsearch {
        hosts => "localhost:9200"
        index => "hive-%{+YYYY.MM.dd}"
    }
}
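
As a debugging aid, the output section could also dump the failing events to a file so I can see exactly which lines do not match (a sketch; the file path is arbitrary):

```
output {
    if "_grokparsefailure" in [tags] {
        file { path => "/tmp/hive-grok-failures.log" }
    } else {
        elasticsearch {
            hosts => "localhost:9200"
            index => "hive-%{+YYYY.MM.dd}"
        }
    }
}
```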

Patterns:

#CUSTOM PATTERNS FOR HIVE LOGS#

HANDLERINFO [a-zA-Z0-9._-]+
THREAD [a-zA-Z0-9._-]+
SESSION [a-zA-Z0-9.:()\s?_-]+
HIVEMESSAGE .*

#2

By the way, all the special characters (: and -) are escaped. No idea why they did not show as escaped after I copy-pasted my config.


#3

You need to use markdown to preserve the special characters, otherwise, for example, an underscore starts italics.

Either select your configuration and pattern in the edit pane and click on </> in the toolbar above the pane, or precede and follow the text you want to preserve with lines containing three backticks - ```


#4

Any help? Does anyone see what is wrong with my grok parsing? Because apparently I am blind.


#5

If I run with

input { generator { count => 1 message => '2019-02-12 10:29:05,073 INFO [HiveServer2-Handler-Pool: Thread-89]: session.SessionState (SessionState.java:createPath(696)) - Created local directory: /tmp/hive/de49ae1c-8cb7-486b-b56d-f957735a07b9' } }
filter {
    grok {
        match => [ "message" , "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}"]
        overwrite =>[ "message" ]
        pattern_definitions => {
            "HANDLERINFO" => "[a-zA-Z0-9._-]+"
            "THREAD" => "[a-zA-Z0-9._-]+"
            "SESSION" => "[a-zA-Z0-9.:()\s?_-]+"
            "HIVEMESSAGE" => ".*"
        }
    }
}
output { stdout { codec => rubydebug { metadata => false } } }

then it matches successfully and I get

       "time" => "10:29:05,073",
   "loglevel" => "INFO",
       "date" => "19-02-12",
     "thread" => "Thread-89",
"hivemessage" => "Created local directory: /tmp/hive/de49ae1c-8cb7-486b-b56d-f957735a07b9",
    "session" => "session.SessionState (SessionState.java:createPath(696))",
"handlerinfo" => "HiveServer2-Handler-Pool",

#6

I know. The grok debugger also says the patterns are correct, but in Kibana I get the _grokparsefailure tag and the message is never split into the new fields for some of the logs. Not sure what is going on.
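
One thing I can check with the generator approach from #5: the stack-trace continuation lines in my first sample (the ones starting with "at ...") do not start with a date, so they cannot match the pattern on their own. If those lines reach Logstash as separate events, each of them would presumably get the _grokparsefailure tag (a sketch to reproduce it):

```
input { generator { count => 1 message => '	at java.lang.Thread.run(Thread.java:748)' } }
filter {
    grok {
        match => [ "message" , "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}"]
        pattern_definitions => {
            "HANDLERINFO" => "[a-zA-Z0-9._-]+"
            "THREAD" => "[a-zA-Z0-9._-]+"
            "SESSION" => "[a-zA-Z0-9.:()\s?_-]+"
            "HIVEMESSAGE" => ".*"
        }
    }
}
output { stdout { codec => rubydebug { metadata => false } } }
```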