Multiline and fully unstructured log file

Hello my friends,
I have some log files that are fully unstructured.
So far I have only handled the first line of each log entry, which contains the date, time, timezone and some other information:

%{DATE_YMD:login_date} %{TIME:login_time} %{ISO8601_TIMEZONE:login_timezone} %{GREEDYDATA:unknow} (?<app_server>application-akka.*) (?m)%{GREEDYDATA:unknown2}
DATE_YMD is a custom pattern in my patterns directory: DATE_YMD %{YEAR}/%{MONTHNUM}/%{MONTHDAY}

Sometimes my log contains information about a client login, which is my problem because it continues on another line.
My first question is how I should handle a log entry that spans multiple lines, and the second is how to create a different pattern based on the login result (which can be success or error).

2016/08/05 15:03:40.445 +0430 - [INFO] - from application in

2013/08/05 19:03:40.445 +0430 - [INFO] - from application in
Login success - UserSession info : UserSession{userInfo=UserInfo{type='null', messageId='null', accountNumber='15391272736024 ', customerId='25814687', firstName='something', latinFirstName='null', lastNam='something', latinLastName='null', isCompany=false, bankAccounts=[BankAccount{id=-1, name='something', latinName='something'}], muserid='something', nationalCode='something', appuserId=null, email='something', onlineSessionTime=240, onlineSessionTimeMobile=5, allowSendSms=false, bourseAccountName='something'}, username='something', remoteAddress='something'}
SupervisorActor , forwardToApi method with apiKey =2224 , Class = class something

2015/02/05 19:03:46.445 +0430 - [INFO] - from application in
Login error : {"title":"problem","description":"password is wrong","errorType":"BAD_REQUEST","errorCode":4dddd0037,"UUID":"sdfsfsd66-jj-90"}

You can use a multiline codec to combine all the lines for one event.

codec => multiline {
  pattern => '^[\d/]{10} [\d:\.]{12} \+\d{4} - '
  negate => true
  what => "previous"
  auto_flush_interval => 1
}

Then use grok to grab the login status

grok { match => { "message" => [ "^Login %{WORD:loginStatus}" ] } }
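To make it easier to see what that grok match pulls out of a combined event, here is a rough Python stand-in, not the grok implementation itself. It assumes the event has already been joined by the multiline codec, so "Login ..." sits at the start of the second line; as far as I know grok's regex engine lets ^ match at the start of any line, which Python only does with re.MULTILINE. The function name login_status is made up for the illustration, and the sample event is shortened from the logs above.

```python
import re

# Rough Python equivalent of the grok pattern "^Login %{WORD:loginStatus}".
# re.MULTILINE lets ^ match at the start of any line inside the combined
# event, which is where "Login ..." ends up after the lines are joined.
LOGIN_RE = re.compile(r"^Login (?P<loginStatus>\w+)", re.MULTILINE)


def login_status(message):
    """Return the word after 'Login' (e.g. 'success' or 'error'), or None."""
    match = LOGIN_RE.search(message)
    return match.group("loginStatus") if match else None


# Sample event, shortened from the log lines in the question.
event = (
    "2015/02/05 19:03:46.445 +0430 - [INFO] - from application in\n"
    'Login error : {"title":"problem","description":"password is wrong"}'
)
print(login_status(event))
```

Events without a login line simply do not match, so in Logstash they would get a _grokparsefailure tag unless you handle that case separately.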

Thanks a lot, I will try this.
Could you explain what codec => multiline { pattern => '^[\d/]{10} [\d:\.]{12} \+\d{4} - ' negate => true what => "previous" auto_flush_interval => 1 } does?

That codec checks each line to see if it matches that regexp. The regexp matches the date at the beginning of some of the lines. If a line does not match (negate => true) then it is appended to the previous line (what => previous). So when it sees a line that starts with a date it starts collecting lines up to the next line that starts with a date and creates a single event from that set of lines.

An event is not flushed until the next line that starts with a date is seen, or the auto_flush_interval times out. If auto_flush_interval was not set then you would not get an event for the last message in the file, since it is not followed by anything that can flush the event.
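If it helps, the grouping logic can be sketched in a few lines of Python. This is only a simulation of the behaviour described above, not how the codec is implemented; group_events is a hypothetical helper, the sample lines are shortened from the logs in the question, and the final flush stands in for auto_flush_interval.

```python
import re

# Same regexp as the codec: date, time with milliseconds, timezone, " - ".
START_RE = re.compile(r"^[\d/]{10} [\d:\.]{12} \+\d{4} - ")


def group_events(lines):
    """Mimic negate => true, what => "previous": a line that does NOT match
    the pattern is appended to the event started by the last matching line."""
    events = []
    current = []
    for line in lines:
        # A line starting with a date closes the event collected so far.
        if START_RE.match(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        # Stands in for auto_flush_interval: nothing follows the last
        # event, so it has to be flushed explicitly.
        events.append("\n".join(current))
    return events


# Sample lines, shortened from the logs in the question.
sample = [
    "2013/08/05 19:03:40.445 +0430 - [INFO] - from application in",
    "Login success - UserSession info : UserSession{...}",
    "SupervisorActor , forwardToApi method with apiKey =2224 , Class = class something",
    "2015/02/05 19:03:46.445 +0430 - [INFO] - from application in",
    'Login error : {"title":"problem","description":"password is wrong"}',
]
for event in group_events(sample):
    print(repr(event))
```

The five sample lines come out as two events: the first holds three lines and the second holds two.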

Thanks a lot.

Badger, I forgot to tell you that I am using Filebeat to send my logs to Logstash:


  - type: log
    paths:
      - /var/log/app.log
output.logstash:
  hosts: ["localhost:5050"]

So on the other side, Logstash is using the beats input plugin:

input {
  beats {
    port => "5050"
  }
}
filter {
  grok {
    match => { "message" => "%{DATE_YMD:login_date} %{TIME:login_time} %{ISO8601_TIMEZONE:login_timezone} %{GREEDYDATA} (?<app_server>application-akka.*)" }
  }
}
output {
  elasticsearch { }
  stdout {}
}
As far as I read on the Elastic site, using the multiline codec with the beats input plugin can cause problems. Am I right?

If you have multiple beats sending data then yes, it would be a problem. You would have to adapt that codec configuration to the beat syntax for multiline.