Hello,
I am trying to parse the message field of some logs and split it into separate fields.
I am using some custom-made grok patterns that should apply to all the messages (tested in a grok debugger). Nevertheless, only some messages are parsed; the rest receive the _grokparsefailure tag. In addition, the messages that are parsed are usually of the same kind.
The obvious explanation would be that the grok patterns do not match the messages, but again, I have tested them at grokdebug.herokuapp.com and the results are as expected. Also, for the messages where the parsing works, the fields are created and appear in the available fields section in the column on the left, but in the source section the message is not replaced by the new fields.
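For anyone who wants to see the split between the two groups, a conditional on that tag separates the failing events; this is just a debugging sketch, and the stdout output here is for illustration, not part of my actual config:

output {
  if "_grokparsefailure" in [tags] {
    # events the grok filter could not match end up here
    stdout { codec => rubydebug }
  }
}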
Sample of the logs:
Parsed logs:
2019-02-12 10:28:10,155 ERROR [HiveServer2-Background-Pool: Thread-76246]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(193)) - AlreadyExistsException(message:Database continuous_deployment already exists)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:979)
at sun.reflect.GeneratedMethodAccessor74.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy20.create_database(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:707)
at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
at com.sun.proxy.$Proxy21.createDatabase(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:376)
at org.apache.hadoop.hive.ql.exec.DDLTask.createDatabase(DDLTask.java:3934)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:276)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2052)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1748)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Logs that are not parsed:
2019-02-12 10:29:05,269 INFO [HiveServer2-Background-Pool: Thread-71289]: ql.Driver (SessionState.java:printInfo(1080)) - OK
2019-02-12 10:29:05,073 INFO [HiveServer2-Handler-Pool: Thread-89]: session.SessionState (SessionState.java:createPath(696)) - Created local directory: /tmp/hive/de49ae1c-8cb7-486b-b56d-f957735a07b9
Logstash conf file:
input {
  beats {
    port => 5045
    client_inactivity_timeout => 0
  }
}

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => [ "message", "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}" ]
    overwrite => [ "message" ]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "hive-%{+YYYY.MM.dd}"
  }
}
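In case it helps to reproduce this without Beats in the picture, a minimal pipeline along these lines (the stdin input and rubydebug output are placeholders for debugging, not part of my real config) exercises the same grok filter against pasted log lines:

input { stdin {} }

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => [ "message", "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}" ]
    overwrite => [ "message" ]
  }
}

# prints every event with all its fields, so both the extracted fields
# and any _grokparsefailure tag are visible immediately
output { stdout { codec => rubydebug } }

Pasting one of the sample lines above into stdin should return the event either with the date/time/loglevel/... fields or with the failure tag.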
Patterns:
#CUSTOM PATTERNS FOR HIVE LOGS#
HANDLERINFO [a-zA-Z0-9._-]+
THREAD [a-zA-Z0-9._-]+
SESSION [a-zA-Z0-9.:()\s?_-]+
HIVEMESSAGE .*
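For completeness, as far as I know the grok filter can also take these definitions inline through its pattern_definitions option, which rules patterns_dir out while debugging; a sketch of the same filter in that form (untested):

filter {
  grok {
    # same custom patterns, inlined instead of read from /etc/logstash/patterns
    pattern_definitions => {
      "HANDLERINFO" => "[a-zA-Z0-9._-]+"
      "THREAD" => "[a-zA-Z0-9._-]+"
      "SESSION" => "[a-zA-Z0-9.:()\s?_-]+"
      "HIVEMESSAGE" => ".*"
    }
    match => [ "message", "%{DATE_EU:date} %{TIME:time} %{LOGLEVEL:loglevel} \[%{HANDLERINFO:handlerinfo}\: %{THREAD:thread}\]\: %{SESSION:session} \- %{HIVEMESSAGE:hivemessage}" ]
    overwrite => [ "message" ]
  }
}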