Multiline read in filebeat

Aditya_Srivastava · November 17, 2016, 10:46am

I have a custom log file in which some entries span multiple lines. Those continuation lines share no common pattern; their content is arbitrary. What kind of parser or setting should I put in the Filebeat config file so that each multiline entry is read as a single event?
Example log file:

Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] INFO  [application level detail. Got response
Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] DEBUG [application-level detail. Response from app is :: {}029B<?xml version="1.0" encoding="UTF-8" ?>
                        
       <Engine>
                <Header>
                        <Version>1.0</Version>
                        <App>ABC</App>
                        <TargetApp>DEF</TargetApp>
                        <Count>161117162528</MsgId>
                        <TimeStamp>2016-11-17T16:25:32.313+05:30</TimeStamp>
                </Header>
                <Body>
                        <AuthRes>
                                <Msgcount>0810</Msgcount>
                                <countDate>20161117</countDate>
                        </AuthRes>
                </Body>
        </Engine>
Nov 17 16:25:30 1.2.3.4 app [App:16:25:28,116] INFO  [application level data]

andrewkroh · November 17, 2016, 1:23pm

I think you could use the timestamp to identify where new log lines start. See the Timestamp example in the documentation.

Aditya_Srivastava · November 17, 2016, 1:27pm

Are you talking about this timestamp? If so, what about the other entries? How do I get all of them as one message?

andrewkroh · November 17, 2016, 1:30pm

No, I'm talking about the timestamp from the logger.

steffens · November 17, 2016, 1:32pm

Have a look at your logs. They are mostly plain text, with each entry starting with the month, day of month and time. The multiline content is always indented. For example, check whether a log line starts with one of these characters: ^[JFMASOND] (the regex matches the first letter of every month name). Alternatively, check whether the line is empty or starts with a space/tab.
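
To make the grouping rule above concrete, here is a rough Python simulation of what Filebeat's `multiline.pattern`, `multiline.negate: true` and `multiline.match: after` settings would do with the pattern `^[JFMASOND]`. This is only a sketch of the behaviour, not Filebeat code; the function name `group_multiline` and the shortened sample lines are made up for illustration.

```python
import re

# Pattern from the post above: a new event starts with the first letter of a month name.
START = re.compile(r"^[JFMASOND]")


def group_multiline(lines, pattern=START):
    """Append every line that does NOT match `pattern` to the preceding event.

    This mimics the negate=true / match=after combination. It illustrates
    the rule only; it is not how Filebeat is implemented internally.
    """
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append(line)        # start of a new event
        else:
            events[-1] += "\n" + line  # continuation of the previous event
    return events


# Shortened, hypothetical lines modelled on the log sample in the question.
sample = [
    "Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] INFO  Got response",
    "Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] DEBUG Response from app is :: {}029B",
    "",
    "       <Engine>",
    "        </Engine>",
    "Nov 17 16:25:30 1.2.3.4 app [App:16:25:28,116] INFO  [application level data]",
]

events = group_multiline(sample)
print(len(events))  # 3: the blank and indented lines are folded into the DEBUG event
```

Note that a continuation line starting at column 0 with one of those capital letters would be treated as a new event, so a stricter pattern based on the full `Nov 17 16:25:30` timestamp is the safer choice if the payload is not reliably indented.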

Aditya_Srivastava · November 17, 2016, 2:14pm

Thank you so much for the reply. Now I get the desired output, with all the multiline content as one message. The only problem now is that every newline shows up as \n and all the spaces show up as \t.

Example:
{"@timestamp":"2016-11-17T14:09:33.670Z","beat":{"hostname":"ip-10-0-0-9","name":"ip-10-0-0-9","version":"5.0.0"},"input_type":"log","message":"Nov 17 19:39:32 RCPPPCFWASN1 Wallet_App__access [WALLET:19:39:28,126] DEBUG [application-akka.actor.default-dispatcher-35959][SwitchPaymentResponseActor.java:45] Response from switch is :: {}029B\u003c?xml version="1.0" encoding="UTF-8" ?\u003e\n\t\u003cEngine\u003c/Header\u003e\n\t\t\u003cBody\u003e\n\t\t\t\u003cAuthRes\u003e\n\t\t\t\t\u003cMsgType\u003e0810\u003c/MsgType\u003e\n\t\t\t\t\u003cTranDate\u003e20161117\u003cmber\u003e\n\t\t\t\t\u003countDate\u003e20161117\u003\n\t\t\t\u003c/AuthRes\u003e\n\t\t\u003c/Body\u003e\n\t\u003c/Engine\u003e","offset":30024253,"source":"/res/1.2.3.4/App/2016-11-17.log",}

seti321 · November 17, 2016, 2:26pm

You could use @sematext/logagent instead of Filebeat, Logstash, etc.
It is a lightweight log shipper written in Node.js.

It can parse your file out of the box:

sudo npm i -g @sematext/logagent
cat test.log | logagent --yaml

@timestamp: Thu Nov 17 2016 15:21:47 GMT+0100 (CET)
message:    Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] INFO  [application level detail. Got response
logSource:  unknown

@timestamp: Thu Nov 17 2016 15:21:47 GMT+0100 (CET)
message: 
  """
    Nov 17 16:25:30 1.2.3.4 appData [app:16:25:28,115] DEBUG [application-level detail. Response from app is :: {}029B<?xml version="1.0" encoding="UTF-8" ?>
                            
           <Engine>
                    <Header>
                            <Version>1.0</Version>
                            <App>ABC</App>
                            <TargetApp>DEF</TargetApp>
                            <Count>161117162528</MsgId>
                            <TimeStamp>2016-11-17T16:25:32.313+05:30</TimeStamp>
                    </Header>
                    <Body>
                            <AuthRes>
                                    <Msgcount>0810</Msgcount>
                                    <countDate>20161117</countDate>
                            </AuthRes>
                    </Body>
            </Engine>
  """
logSource:  unknown

@timestamp: Thu Nov 17 2016 15:21:48 GMT+0100 (CET)
message:    Nov 17 16:25:30 1.2.3.4 app [App:16:25:28,116] INFO  [application level data]
logSource:  unknown

To ship logs to Elasticsearch, use:

logagent -e http://localhost:9200 -i logs /var/log/*.log

More info here: https://sematext.com/logagent/

Aditya_Srivastava · November 17, 2016, 2:40pm

Thank you, sir, I will look into it. We also use Logstash for filtering and parsing the data. Can that be done with Logagent?

seti321 · November 17, 2016, 2:46pm

Yes. You can define custom patterns to structure logs, and apply JavaScript functions or SQL queries to them before you ship the structured and aggregated logs to Elasticsearch.

Various log formats are supported out of the box (nginx, system logs, mongodb, elasticsearch, hadoop, kafka ...).
Example pattern: http://sematext.github.io/logagent-js/parser/

Aditya_Srivastava · November 17, 2016, 2:53pm

How do we use Kafka with Logagent? Can we take the discussion to a new thread?

seti321 · November 17, 2016, 3:13pm

We could create a Kafka output plugin, but you might not need it, because Logagent has a disk buffer (used when the connection to Elasticsearch fails) and retransmits logs when Elasticsearch is available again.

You can ask questions on GitHub https://github.com/sematext/logagent-js or in the new forum: https://groups.google.com/forum/#!forum/logagent

Aditya_Srivastava · November 17, 2016, 3:33pm

Hi, I tried the solution you gave. The pattern works as expected when I test the pattern and content on the Go playground. But when I try it in my Filebeat config, the output shows \n for newlines and \t for spaces. Please help me out with this.

seti321 · November 17, 2016, 4:03pm

Sorry, I can't help with Filebeat mixing up whitespace.

Logagent pattern definition would look like this:

patterns:
 - # multiline with blockStart regex
  sourceName: !!js/regexp /mylogs/
  blockStart: !!js/regexp /^\S+\s\d+\s\d\d:\d\d:\d\d/
  match:
    - type: mylogs
      regex: !!js/regexp /^(\S+\s\d+\s\d\d:\d\d:\d\d)\s(\S+)\s(\S+)\s\[.+\]\s(\S+)\s([\S|\s]+)/
      fields:
        - ts
        - ip_address:string
        - app:string
        - severity:string
        - message:string
      dateFormat: MMM D HH:mm:ss

You could also extract fields from the XML inside, with a bit more work on the regex.

andrewkroh · November 17, 2016, 6:24pm

This is normal. The data is put into a JSON object, and since newline and tab characters are special characters they need to be encoded. You'll notice that quotes in the message are escaped as well.
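
A quick way to convince yourself that nothing is lost: decoding the JSON string gives back the real newline, tab and angle-bracket characters. The snippet below is a small Python sketch; the `encoded` value is a shortened, made-up stand-in for the `message` field in the output above, not the actual event.

```python
import json

# Shortened, hypothetical "message" value as it would appear inside the JSON output.
# A raw string is used so the backslash escapes reach the JSON parser untouched.
encoded = r'"DEBUG Response is :: \u003cEngine\u003e\n\t\u003cBody/\u003e\n\u003c/Engine\u003e"'

# Decoding turns \n, \t and \u003c / \u003e back into the original characters.
decoded = json.loads(encoded)
print(decoded)

# Encoding and decoding again is lossless, so the stored string keeps its line breaks.
assert json.loads(json.dumps(decoded)) == decoded
```

Any JSON consumer does this decoding when it reads the event, so the escapes only exist in the serialized form.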

What problem is this causing you?

Aditya_Srivastava · November 18, 2016, 3:04pm

Yeah, you were right. The final message I get in ES is in the proper format, except that there are no newlines; the whole message is appended on one line, separated by spaces. But I can still work with it. Thanks.

steffens · November 18, 2016, 3:42pm

This makes me wonder how you display your message. Are you viewing it in a browser, e.g. using Kibana? The newline characters are included in the stored string (the JSON encoding uses \n for newlines). But when the string is displayed in a browser as plain HTML (without transforming \n to <br> or wrapping it in a <pre> tag), the newlines are ignored by HTML.
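
For illustration, here are the two options mentioned above as a minimal Python sketch. The helper names `to_html_pre` and `to_html_br` and the sample message are made up, and this says nothing about how Kibana itself renders fields; it only shows how a custom page could keep the line breaks visible.

```python
import html


def to_html_pre(message):
    """Wrap the message in <pre> so the browser keeps newlines and tabs as they are."""
    return "<pre>" + html.escape(message) + "</pre>"


def to_html_br(message):
    """Escape the message, then turn every newline into an explicit <br> tag."""
    return html.escape(message).replace("\n", "<br>\n")


# Hypothetical multiline message containing XML, as stored after JSON decoding.
message = "Response:\n\t<Engine>\n\t</Engine>"

print(to_html_pre(message))
print(to_html_br(message))
```

Escaping first matters here because the message contains XML: without `html.escape`, the browser would try to interpret `<Engine>` as a tag instead of showing it.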

