Help on multiline


(Greg) #1

Hi there,

I'm trying to translate my Logstash configuration to use Filebeat and the ingest pipeline feature.

While using logstash, I had the following configuration:
---- LOGSTASH -----
input:

codec => multiline {
     pattern => "%{SYSLOG5424SD}:%{DATESTAMP}].*"
     negate => "true"
     what => "previous"
}

filter:

grok {
    match => { 'message' => '%{SYSLOG5424SD:thread}:%{DATESTAMP:datetime}] %{LOGLEVEL:level}\s{1,}\[id:%{GREEDYDATA:solid_rid}\] %{JAVACLASS:class}: %{GREEDYDATA:log_message}' }
}

---- FILEBEAT + INGEST NODE ----
Now, for Filebeat, I'm using the following prospector settings (I tried to translate the pattern from Logstash one-to-one):

multiline.pattern: '\[.*?\]+:(?:(?:1[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]).(?:0?[1-9]|1[0-2]).(20)\d{2}\s*(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9]):(?:(?:[0-5][0-9]|60)(?:[:.,][0-9]+)?)\].*'
multiline.negate: true
multiline.match: before

And the following pipeline has been configured:

"processors" : [
  {
    "grok": {
      "field" : "message",
      "patterns" : [
        "%{SYSLOG5424SD:thread}:%{DATESTAMP:timestamp}\\] %{LOGLEVEL:level}\\s{1,}\\[id:%{GREEDYDATA:solid_rid}\\] %{JAVACLASS:class}: %{GREEDYDATA:log_message}"
      ]
    }
  },
  {
    "date" : {
      "field" : "timestamp",
      "formats" : ["dd.MM.YYYY HH:mm:ss.SSS"]
    }
  }
]

But the "multiline" logs are not processed into Elasticsearch... I get the following error message:
IllegalArgumentException: field [timestamp] not present as part of path [timestamp]

Which I assume comes from the "at ..." lines, which don't contain the [timestamp] field...

---- MULTILINE ERROR MESSAGE ----

[DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession@555659779[connectionId=414D51434D51484F505141202020202008C5AB59265D7F01;conversationId=3]]]:01.01.2017 01:57:57.243] WARN [id:]
commons.service.mq.MQMessageListenerContainer: Execution of JMS message listener failed, and no ErrorHandler has been set.
org.springframework.jms.listener.adapter.ListenerExecutionFailedException: Listener method 'handleMessage' threw exception; nested exception is java.lang.NumberFormatException: null
at org.springframework.jms.listener.adapter.MessageListenerAdapter.invokeListenerMethod(MessageListenerAdapter.java:309)

Any help is welcome!
Thanks in advance,


(Steffen Siering) #2

With multiline one tries to capture some kind of structure, not actual contents. I guess the regular expression can be simplified somewhat. Can you share some sample logs for testing?


(Greg) #3

Hi Steffen,
Thanks for your prompt answer!

Sure, hereafter some additional log entries:

[DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession@555659779[connectionId=414D51434D51484F505141202020202008C5AB59265D7F01;conversationId=3]]]:07.09.2017 23:52:22.663] WARN  [id:] .commons.service.mq.MQMessageListenerContainer: Execution of JMS message listener failed, and no ErrorHandler has been set.
org.springframework.jms.listener.adapter.ListenerExecutionFailedException: Listener method 'handleMessage' threw exception; nested exception is java.lang.NumberFormatException: null
        at org.springframework.jms.listener.adapter.MessageListenerAdapter.invokeListenerMethod(MessageListenerAdapter.java:309)
        at org.springframework.jms.listener.adapter.MessageListenerAdapter.onMessage(MessageListenerAdapter.java:230)
        at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:678)
        at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:638)
        at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:608)
        at org.springframework.jms.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:579)
        at org.springframework.jms.listener.SimpleMessageListenerContainer.processMessage(SimpleMessageListenerContainer.java:329)
        at org.springframework.jms.listener.SimpleMessageListenerContainer$2.onMessage(SimpleMessageListenerContainer.java:305)
        at com.ibm.mq.jms.MQMessageConsumer$FacadeMessageListener.onMessage(MQMessageConsumer.java:331)
        at com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl$JmsProviderMessageListener.onMessage(JmsMessageConsumerImpl.java:927)
        at com.ibm.msg.client.wmq.internal.WMQAsyncConsumerShadow.honourNoLocal(WMQAsyncConsumerShadow.java:631)
        at com.ibm.msg.client.wmq.internal.WMQAsyncConsumerShadow.consumer(WMQAsyncConsumerShadow.java:468)
        at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.driveConsumer(RemoteProxyQueue.java:4239)
        at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.deliverMsgs(RemoteProxyQueue.java:4834)
        at com.ibm.mq.jmqi.remote.impl.RemoteDispatchThread.deliverMsgsReconnectable(RemoteDispatchThread.java:578)
        at com.ibm.mq.jmqi.remote.impl.RemoteDispatchThread.deliverMsgs(RemoteDispatchThread.java:544)
        at com.ibm.mq.jmqi.remote.impl.RemoteDispatchThread.run(RemoteDispatchThread.java:280)
        at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.runTask(WorkQueueItem.java:209)
        at com.ibm.msg.client.commonservices.workqueue.SimpleWorkQueueItem.runItem(SimpleWorkQueueItem.java:100)
        at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.run(WorkQueueItem.java:224)
        at com.ibm.msg.client.commonservices.workqueue.WorkQueueManager.runWorkQueueItem(WorkQueueManager.java:298)
        at com.ibm.msg.client.commonservices.j2se.workqueue.WorkQueueManagerImplementation$ThreadPoolWorker.run(WorkQueueManagerImplementation.java:1214)
Caused by: java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:542)
        at java.lang.Integer.parseInt(Integer.java:615)
        at com.importer.cssearch.threads.UniverseTaggingThread.tagListingsNotPresentInCassandra(UniverseTaggingThread.java:768)
        at com.importer.cssearch.threads.UniverseTaggingThread.onSubscribedMessage(UniverseTaggingThread.java:1313)
        at com.importer.cssearch.threads.UniverseTaggingThread.onSubscribedMessage(UniverseTaggingThread.java:53)
        at commons.service.cassandra.TloUpdateMQSubscriberService.onSubscribedMessage(TloUpdateMQSubscriberService.java:100)
        at commons.service.cassandra.TloUpdateMQSubscriberService.onSubscribedMessage(TloUpdateMQSubscriberService.java:11)
        at commons.service.queue.MessageQueueSubscriberFactory$1.handleMessage(MessageQueueSubscriberFactory.java:66)
        at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:269)
        at org.springframework.jms.listener.adapter.MessageListenerAdapter.invokeListenerMethod(MessageListenerAdapter.java:301)
        .. 21 common frames omitted
[DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession@555659779[connectionId=414D51434D51484F505141202020202008C5AB59265D7F01;conversationId=3]]]:07.09.2017 23:52:23.833] INFO  [id:] 
com. importer.cssearch.threads.UniverseTaggingThread: insertNewListingFromVdps() CS private listing has been created with listingId: 18638487,218,333 from VDPS
[DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession@555659779[connectionId=414D51434D51484F505141202020202008C5AB59265D7F01;conversationId=3]]]:07.09.2017 23:52:27.956] INFO  [id:] 
com. importer.cssearch.threads.UniverseTaggingThread: insertNewListingFromVdps() CS private listing has been created with listingId: 13090717,192,484 from VDPS
[DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession@555659779[connectionId=414D51434D51484F505141202020202008C5AB59265D7F01;conversationId=3]]]:07.09.2017 23:52:27.976] INFO  [id:] 
com. importer.cssearch.threads.UniverseTaggingThread: insertNewListingFromVdps() CS private listing has been created with listingId: 13090717,945,484 from VDPS
[AnswerReaderThread]:07.09.2017 23:52:37.518] INFO  [id:] dapi.accessLog: fewferf234r2.r2.ch - - [07/09/2017:11:52:37 +0200] "GET method=getCSSInfo&ctx=Type%3A%20API%2C%20ID%3A%20HealthChe
ck%2C%20optional_info%3A%20&rid=827&uid=CH402-cfewferfr01" 200 5
[elasticsearch[Dragoness][bulk_processor][T#1]]:07.09.2017 23:52:45.966] INFO  [id:] solid.elasticsearch.BulkProcessorBuilderImpl: beforeBulk(executionId:26533), numberOfActions: 11
[CSUniverseGenerationThread]:07.09.2017 23:53:27.792] INFO  [id:] com. importer.cssearch.threads.UniverseTaggingThread: stopCsfiTagging() - Stopping
[CSUniverseGenerationThread]:07.09.2017 23:53:27.792] INFO  [id:] .commons.service.mq.MQListenerConnectionHandler: stopListeningWithoutClosingConnection() - Stopping listenerContainer

Thanks
Greg


(Steffen Siering) #4

The simplest matching pattern I can come up with is to simply check for `[` at the beginning of a line. Given that the matching happens via regular expressions, just checking for one character at the beginning of a line will perform much better than the complicated pattern with nested alternations (which trigger backtracking).

  multiline:
    pattern: '^\['
    negate: true
    match: before

You can test the pattern against other logs in this simple tester script: https://play.golang.org/p/ppweRUwtRB
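The grouping this config achieves can be sketched roughly as follows. This is a simplification for illustration, not Filebeat's actual implementation, and `group_multiline` is a hypothetical helper: with negate: true, a line matching the pattern starts a new event and the following non-matching lines stick to it.

```python
import re

def group_multiline(lines, pattern=r'^\['):
    """Hypothetical helper (not Filebeat's code): a line matching `pattern`
    starts a new event; every following non-matching line is appended to
    the event it opened."""
    events, current = [], []
    for line in lines:
        if re.search(pattern, line) and current:
            events.append("\n".join(current))  # flush the previous event
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

log = [
    "[Thread-1]:01.01.2017 01:57:57.243] WARN  [id:] listener failed",
    "java.lang.NumberFormatException: null",
    "    at java.lang.Integer.parseInt(Integer.java:542)",
    "[Thread-2]:01.01.2017 01:58:00.000] INFO  [id:] all good",
]
events = group_multiline(log)
# events[0] is the WARN line plus its stack trace; events[1] is the INFO line
```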

For the ingest node pipeline I don't know. You can use the simulate API to test the pipeline against some more sample logs. In addition, you are well advised to install a failure handler which stores the error message and the original contents when an error occurs. You can even change the _index field in the failure handler, so that another Elasticsearch index can be used as a dead letter queue. Check out the documentation on handling pipeline failures. For an example, see the mysql Filebeat module: https://github.com/elastic/beats/blob/master/filebeat/module/mysql/slowlog/ingest/pipeline.json#L33
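A sketch of such a failure handler attached to the pipeline (the `error.message` field and the `failed-` index prefix here are illustrative choices, not from the thread):

```json
"on_failure": [
  {
    "set": {
      "field": "error.message",
      "value": "{{ _ingest.on_failure_message }}"
    }
  },
  {
    "set": {
      "field": "_index",
      "value": "failed-{{ _index }}"
    }
  }
]
```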


(Greg) #5

Hi @steffens,

Thanks a lot, it works better, but still not as expected :wink:

Now I get the "ingest timestamp" as @timestamp.
And the ingest timestamp is 2 hours behind, even though I specify timezone: "Europe/Amsterdam" in the pipeline.

Thanks
Greg


(Steffen Siering) #6

Filebeat does not parse any contents. That is, the @timestamp field is the timestamp at which Filebeat acquired the first line of an event. I see your Logstash filters use the keyword timestamp. Could it be you have two timestamps by chance? That is, timestamp and @timestamp?

Filebeat sends @timestamp normalized to UTC. Starting with 6.0-beta1, you can configure Filebeat to also pass the timezone info.
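If memory serves, the way to pass the timezone in Filebeat 6.0 is the add_locale processor in filebeat.yml; a sketch under that assumption (the exact field it adds, e.g. beat.timezone, depends on the version):

```yaml
# filebeat.yml -- forward the host's timezone with each event
processors:
  - add_locale:
      format: offset   # e.g. "+02:00" instead of the abbreviation
```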


(Greg) #7

Hi @steffens,
Thanks for the clarification!

I've now renamed the timestamp field to date to avoid the confusion. So in the ingest JSON I now have:

"date" : {
   "field" : "date",
   "target_field" : "timestamp",
   "formats" : ["dd.MM.YYYY HH:mm:ss.SSS"]
}

But... the date field now has the string format, and the timestamp still has the +2 hours (see above):
(date: 14.09.2017 08:40:29.760)
(@timestamp: September 14th 2017, 10:40:29.760)

Shall I update my index (index-*) as follows?

curl -XPUT http://127.0.0.1:9200/index-*

 "mappings" : {
     "my_type": {
     "properties": {
       "date": {
           "type":   "date",
           "format": "dd.MM.YYYY HH:mm:ss.SSS"
       [...]

Thanks for your valuable help!
Greg


(Steffen Siering) #8

The date processor does parse a string into a date value. But if the mapping in the index says 'string', the value will be converted back into a string upon indexing.

Normally users want to extract the actual event's timestamp with grok only. That is, instead of having two fields named timestamp and date, I would give them more descriptive names. Also consider the index name you want to write to, as the @timestamp field is normally used to set the timestamp. Plus, most users are used to the @timestamp field being the actual event's timestamp (it's the default in Kibana). For this reason I would store the extracted timestamp (which you named date) in the @timestamp field. The old @timestamp from Filebeat is often not needed. If you decide you still need the timestamp from Filebeat, I'd propose giving it a more descriptive name like read_timestamp.

Now you've got me confused between Logstash, ingest node, timestamp, and date :slight_smile:

Using your original configuration, the date processor will write the log's timestamp into @timestamp. I guess part of the problem with parsing your timestamp is that the timestamp itself does not contain a timezone. At worst, one can pre- or post-process the timestamp using the script processor :slight_smile:

I'm not really sure about the date processor right now, though. I think you are better advised to test with the ingest node simulate API, and to ask in the Elasticsearch forums if you're still facing problems.
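Putting Steffen's suggestion together, the date processor from the thread might look like this. This is a sketch, not a tested config: the field names and the timezone value are taken from earlier posts, and whether the timezone option fully resolves the +2h offset would need verifying via the simulate API.

```json
{
  "date": {
    "field": "date",
    "target_field": "@timestamp",
    "formats": ["dd.MM.YYYY HH:mm:ss.SSS"],
    "timezone": "Europe/Amsterdam"
  }
}
```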


(Greg) #9

Thanks @steffens!
It's working!

The only problem I'm facing is that some multiline logs are not matched... even though they work on http://grokconstructor.appspot.com/do/match

Log message:

[UniverseGenerationThread]:02.10.2017 00:02:12.560] ERROR [id:] dpb.commons.service.queue.MessageQueueSubscriber: startListening() - Exception in mqListenerConnectionHandler
org.springframework.jms.UncategorizedJmsException: Uncategorized exception occured during JMS processing; nested exception is com.ibm.msg.client.jms.DetailedJMSException: MQ0018: Die Verbindung zum WS-Manager 'MQH00' mit dem Verbindungsmodus 'Client' und dem Hostnamen '10.10.1.1(2121)' konnte nicht hergestellt werden.
Vergewissern Sie sich, dass der WS-Manager gestartet wurde, und falls er im Clientmodus ausgeführt wird, prüfen Sie, ob ein Empfangsprogramm aktiv ist. Weitere Informationen finden Sie in der verknüpften Ausnahmebedingung.; nested exception is com.ibm.mq.MQException: MQ1201: Der WebSphere MQ-Aufruf ist mit Beendigungscode '2' ('MQCC_FAILED') und Ursache '2537' ('MQRC_CHANNEL_NOT_AVAILABLE') fehlgeschlagen.
at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:316)
at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:169)
at org.springframework.jms.listener.AbstractJmsListeningContainer.start(AbstractJmsListeningContainer.java:273)
at dpb.commons.service.mq.MQListenerConnectionHandler.start(MQListenerConnectionHandler.java:47)
at dpb.commons.service.queue.MessageQueueSubscriber.startListening(MessageQueueSubscriber.java:50)
at dpb.commons.service.cassandra.TloUpdateMQSubscriberService.startListening(TloUpdateMQSubscriberService.java:77)
at com.importer.search.threads.UniverseTaggingThread.startCsfiTagging(UniverseTaggingThread.java:691)
at com.importer.search.threads.UniverseTaggingThread.run(UniverseTaggingThread.java:358)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.ibm.msg.client.jms.DetailedJMSException: MQ0018: Die Verbindung zum WS-Manager 'MQ0001' mit dem Verbindungsmodus 'Client' und dem Hostnamen '10.10.1.1(2121)' konnte nicht hergestellt werden.
at com.ibm.msg.client.wmq.common.internal.Reason.reasonToException(Reason.java:580)
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:216)
at com.ibm.msg.client.wmq.internal.WMQConnection.(WMQConnection.java:423)
at com.ibm.msg.client.wmq.factories.WMQConnectionFactory.createV7ProviderConnection(WMQConnectionFactory.java:6772)
at com.ibm.msg.client.wmq.factories.WMQConnectionFactory.createProviderConnection(WMQConnectionFactory.java:6151)
at com.ibm.msg.client.jms.admin.JmsConnectionFactoryImpl.createConnection(JmsConnectionFactoryImpl.java:280)
at com.ibm.mq.jms.MQConnectionFactory.createCommonConnection(MQConnectionFactory.java:6143)
at com.ibm.mq.jms.MQConnectionFactory.createConnection(MQConnectionFactory.java:6172)
at org.springframework.jms.support.JmsAccessor.createConnection(JmsAccessor.java:180)
at org.springframework.jms.listener.AbstractJmsListeningContainer.createSharedConnection(AbstractJmsListeningContainer.java:413)
at org.springframework.jms.listener.AbstractJmsListeningContainer.establishSharedConnection(AbstractJmsListeningContainer.java:381)
at org.springframework.jms.listener.AbstractJmsListeningContainer.doStart(AbstractJmsListeningContainer.java:285)
at org.springframework.jms.listener.SimpleMessageListenerContainer.doStart(SimpleMessageListenerContainer.java:209)
at org.springframework.jms.listener.AbstractJmsListeningContainer.start(AbstractJmsListeningContainer.java:270)
... 6 common frames omitted
Caused by: com.ibm.mq.MQException: MQ0001: Der WebSphere MQ-Aufruf ist mit Beendigungscode '2' ('MQCC_FAILED') und Ursache '2537' ('MQRC_CHANNEL_NOT_AVAILABLE') fehlgeschlagen.
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:204)
... 18 common frames omitted
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2537;AMQ9204: Verbindung zu Host '10.10.1.1(2121)' zurückgewiesen. [1=com.ibm.mq.jmqi.JmqiException[CC=2;RC=2537;AMQ1010: Der ferne Kanal 'XIS.I.ZH402.TT.rere' auf Host '' ist momentan nicht verfügbar. [3=XIS.I.ZH402.TT.rere]],3=10.10.1.1(2121),5=RemoteConnection.analyseErrorSegment]
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:2068)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1343)
at com.ibm.msg.client.wmq.internal.WMQConnection.(WMQConnection.java:342)
... 17 common frames omitted
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2537;AMQ9558: Der ferne Kanal 'XIS.I.ZH402.TT.rere' auf Host '' ist momentan nicht verfügbar. [3=XIS.I.ZH402.TT.rer]
at com.ibm.mq.jmqi.remote.impl.RemoteConnection.analyseErrorSegment(RemoteConnection.java:3842)
at com.ibm.mq.jmqi.remote.impl.RemoteConnection.receiveTSH(RemoteConnection.java:2747)
at com.ibm.mq.jmqi.remote.impl.RemoteConnection.initSess(RemoteConnection.java:1022)
at com.ibm.mq.jmqi.remote.impl.RemoteConnection.connect(RemoteConnection.java:715)
at com.ibm.mq.jmqi.remote.impl.RemoteConnectionSpecification.getSessionFromNewConnection(RemoteConnectionSpecification.java:355)
at com.ibm.mq.jmqi.remote.impl.RemoteConnectionSpecification.getSession(RemoteConnectionSpecification.java:264)
at com.ibm.mq.jmqi.remote.impl.RemoteConnectionPool.getSession(RemoteConnectionPool.java:144)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1690)
... 19 common frames omitted

Pattern:

%{SYSLOG5424SD:thread}:%{DATESTAMP:logdate}] %{LOGLEVEL:level}\s{1,}\[id:%{GREEDYDATA:solid_rid}\] %{GREEDYDATA:class}: %{GREEDYDATA:log_message}

Multiline:

multiline.pattern: '^\['
multiline.negate: true
multiline.match: before
  • Is the message size limited?
  • Or might characters like "ü" be a problem?

(Steffen Siering) #10

Logstash grok and ingest node grok do not use the same implementation. There might be subtle differences you are running into (which grokconstructor does not account for) :confused: Best to try with the simulate API.
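For reference, a minimal simulate request looks roughly like this (the grok pattern and sample message here are placeholders, not Greg's actual pipeline):

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      { "grok": { "field": "message", "patterns": ["%{LOGLEVEL:level} %{GREEDYDATA:msg}"] } }
    ]
  },
  "docs": [
    { "_source": { "message": "WARN something happened" } }
  ]
}
```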

Filebeat by default captures a maximum of 10MB (the max_bytes setting) and at most 500 lines (the multiline.max_lines setting) per event. Additional lines are dropped.


(Greg) #11

Hi @steffens,
I've found the issue. It was in my multiline pattern...

Can I specify a different "multiline.pattern" for each prospector?
I've tried but it doesn't seem to work...

Thanks
Greg


(Andrew Kroh) #12

Yes, the multiline settings are local to each prospector. (In fact there is no way to specify a single global multiline pattern.)

filebeat.prospectors:
- # Prospector 1
  paths: [/path/a]
  multiline.pattern: '^\['
  multiline.negate: true
  multiline.match: before
- # Prospector 2
  paths: [/path/b]
  multiline.pattern: '^[[:space:]]'
  multiline.negate: false
  multiline.match: after

(system) #13

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.