Logstash parsing and Watcher alerting

Hi everyone!

I have kind of these log messages:

Jul 23 09:24:16 mmr mmr-core[4268]: Processing file [Aweg3AOMTs_1563866656876839.mt], passing to thread [5147]
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt database [SELECT carrier,validity,now()>validity as now_valid FROM npdbcz WHERE `range` IN ('123456789','123456789','123456789','123456789','123456789','123456789') ORDER BY now_valid DESC,validity DESC]: result: () time: 1.32 ms 
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt NPDB::query(111=npdbcz,123456789): no match [2.05 ms]
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt GetDestination(2,+31112223344,1,1,0): MATCH:205 "PR+420603" (0.031ms prefix match, 2.261ms NPDB)
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt GetProvider_v4(3,0,10,+31112223344,+31112223344,-TD------,1): "RT3/DT2/D205(PR+420603):tosoko.EMIDirect(W50)", total time 2.859ms (0.031ms prefix match, 2.261ms NPDB, 0.063ms route match)
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt Provider found: [tosoko], Connector found: [tosoko.EMIDirect] "RT3/DT2/D205(PR+420603):tosoko.EMIDirect(W50)"
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt MT_Source (+31112223344) sources_regexp_match 1 alphanumlist_match 
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt MT Routing: s10 [Aweg3AOMTs], bg [], Source [+31112223344], Destination [+31112223344] Capabilities [-TD------], prio3, #1: DestID [205], Provider [tosoko], Connector [tosoko.EMIDirect], trace [RT3/DT2/D205(PR+420603):tosoko.EMIDirect(W50)]
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt Writing file into [/var/mmr/queues/connectors/tosoko.EMIDirect/mt/3] (prio: per-message priority: 3)
Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt processMTMessage() #12798 realtime: 5.684 ms
Jul 23 09:24:18 mmr mmr-core[4268]: Processing file [Aweg3AOMTs_1563866656876839.0.dn], passing to thread [5210]
Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn DN: MessageID:[Aweg3AOMTs_1563866656876839] RefID:[00420603426965:230719092416], Source:[+31112223344], Destination:[+31112223344], StatusCode:[0], StatusText:[Message delivered], Timestamp:[20190723092418]
Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn Forwarding DN via SMPPAwegInternal, moving to [/var/mmr/queues/SMPPAwegInternal]
Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms

First of all I need to parse the rows in logstash but don't really know how to manage parsing of different types of logs. I have experience only with the same rows of logs...

My second problem is that I want to trigger Wather alert based on timestamp information so that log which arrives after some amount of time will be spotted and alerted.

For example:

Jul 23 09:24:16 mmr mmr-core[5147]:  Aweg3AOMTs_1563866656876839.mt processMTMessage() #12798 realtime: 5.684 ms

This message has id "Aweg3AOMTs_1563866656876839" and arrived at Jul 23 09:24:16

On the other hand:

Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms

This message with the same id: Aweg3AOMTs_1563866656876839 arrived at Jul 23 09:24:18 so alert would be triggered if second message arrived for example 10 seconds later than first one.

If anyone has any experience with similar problem, please help :slight_smile:

Thank you in advance!

I would start with

    dissect { mapping => { "message" => "%{[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{f1} %{f2}[%{f3}]: %{restOfLine}" } }
    date { match => [ "[@metadata][ts]", "MMM dd HH:mm:ss" ] }

If you do not want to save the values of f1/f2/f3 you can use %{} to discard the contents of the matched field.

You can match restOfLine using grok with an array of patterns. Hard to say more without knowing which fields you want from which messages.

Badger, as always thank you for reply!

I've read somthing about dissection and tried something but it didn't do what I wanted :smile:

I'd like to achieve something like that:

Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms

to see fields like:

date: Jul 23 09:24:18
host: mmr mmr-core[5210]
id: 1563866656876839 (here I have problem with separating only the number from Aweg3AOMTs_1563866656876839.0.dn
type: processDN() #7750
realtime: 1.382

The first three are the same in every log message what else is importat is the realtime...

Can you explain your code on one of the lines in my example please :pray:

You could try

    dissect { mapping => { "message" => "%{[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{host} %{process}: %{restOfLine}" } }
    date { match => [ "[@metadata][ts]", "MMM dd HH:mm:ss" ] }

    grok {
        pattern_definitions => { "LETTERSANDNUMBERS" => "[A-Za-z0-9]+" }
        match => {
            "restOfLine" => [
                "^Processing file \[%{LETTERSANDNUMBERS}_%{INT:correlationId}\.",
                "^ %{LETTERSANDNUMBERS}_%{INT:correlationId}\."
            ]
        }
    }
    if "realtime" in [message] {
        grok { match => { "restOfLine" => "^ %{NOTSPACE} %{DATA:type} realtime: %{NUMBER:realtime}" } }
    }

which would get you

         "host" => "mmr",
      "process" => "mmr-core[5147]",
   "restOfLine" => " Aweg3AOMTs_1563866656876839.mt processMTMessage() #12798 realtime: 5.684 ms",
"correlationId" => "1563866656876839",
         "type" => "processMTMessage() #12798",
     "realtime" => "5.684"

I know there is no need to define that LETTERSANDNUMBERS pattern, but I am drawing a blank on the right charset to use at the moment.

Hi Badger, thank you for your help!

Unfortunately when I trying your code it doesn't work :slightly_frowning_face:

When I run easy filter like

%{SYSLOGTIMESTAMP:logtimestamp} %{HOSTNAME:hostname} %{DATA:type} %{USERNAME:id} %{GREEDYDATA:rest}

and my config file looks like:

input {

  file {
path => "/home/vladbak2/ah.txt"
start_position => "beginning"
sincedb_path => "/dev/null"
  }

}



filter {

 grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:logtimestamp} %{HOSTNAME:hostname} %{DATA:type} %{USERNAME:id} %{GREEDYDATA:rest}"}

   }


}





output {
  elasticsearch {
   hosts => ["localhost:9200"]
   index => "my_index1"
  }
}

And when I run /bin/logstash -f /etc/logstash/conf.d/load.conf

It gaves me following output:

[INFO ] 2019-07-24 08:24:39.952 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, :thread=>"#<Thread:0x2a05edd1 run>"}
[INFO ] 2019-07-24 08:24:41.146 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2019-07-24 08:24:41.446 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2019-07-24 08:24:41.537 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2019-07-24 08:24:43.100 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9601}

And file arrives to elasticsearch so I can see it in kibana just fine. After I change the config file for:

input {

  file {
path => "/home/vladbak2/ah.txt"
start_position => "beginning"
sincedb_path => "/dev/null"
  }

}



filter {

dissect { mapping => { "message" => "%{[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{host} %{process}: %{restOfLine}" } }
date { match => [ "[@metadata][ts]", "MMM dd HH:mm:ss" ] }

grok {
    pattern_definitions => { "LETTERSANDNUMBERS" => "[A-Za-z0-9]+" }
    match => {
        "restOfLine" => [
            "^Processing file \[%{LETTERSANDNUMBERS}_%{INT:correlationId}\.",
            "^ %{LETTERSANDNUMBERS}_%{INT:correlationId}\."
        ]
    }
}
if "realtime" in [message] {
    grok { match => { "restOfLine" => "^ %{NOTSPACE} %{DATA:type} realtime: %{NUMBER:realtime}" } }
}


}





output {
  elasticsearch {
   hosts => ["localhost:9200"]
   index => "my_index1"
  }
}

It gaves me almost the same output:

[INFO ] 2019-07-24 08:29:31.024 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, :thread=>"#<Thread:0x204a83d4 run>"}
[INFO ] 2019-07-24 08:29:31.969 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2019-07-24 08:29:32.315 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2019-07-24 08:29:32.359 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2019-07-24 08:29:34.134 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9601}

But this time it doesn't arrive to elasticsearch and I cannot see it in Kibana.

Before I run the modified config file I installed dissect filter plugin:

#cd /usr/share/logstash
# bin/logstash-plugin install logstash-filter-dissect

Maybe I need to install some other plugins or so... ?

PS: After 3 seconds it added WARN message:

[WARN ] 2019-07-24 09:10:14.381 [[main]>worker3] Dissector - Dissector mapping, field found in event but it was empty {"field"=>"message", "event"=>{"tags"=>["_dissectfailure"], "path"=>"/home/vladbak2/ah.txt", "host"=>"bak2", "@timestamp"=>2019-07-24T09:10:12.013Z, "@version"=>"1", "message"=>""}}

And I recieved following message in elasticsearch:

@version:1 tags:_dissectfailure, _grokparsefailure path:/home/vladbak2/ah.txt host:bak2 @timestamp:Jul 24, 2019 @ 11:08:53.030 message: Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms _id:-NE9I2wBuGK9b84mkQBu _type:_doc _index:my_index1 _score:1

Finally I found out that it arrives to the elasticsearch and I can find in console in format I expected but I cannot see it in Kibana.

Debug output after running /bin/logstash -f /etc/logstash/conf.d/load.conf :

[INFO ] 2019-07-24 11:44:28.438 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9601}
[WARN ] 2019-07-24 11:44:32.589 [[main]>worker3] Dissector - Dissector mapping, field found in event but it was empty {"field"=>"message", "event"=>{"host"=>"bak2", "path"=>"/home/vladbak2/ah.txt", "@version"=>"1", "@timestamp"=>2019-07-24T11:44:29.100Z, "tags"=>["_dissectfailure"], "message"=>""}}
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
         "host" => "mmr",
   "restOfLine" => " Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms",
     "realtime" => "1.382",
         "path" => "/home/vladbak2/ah.txt",
"correlationId" => "1563866656876839",
         "type" => "processDN() #7750",
     "@version" => "1",
   "@timestamp" => 2019-07-23T09:24:18.000Z,
      "process" => "mmr-core[5210]",
      "message" => "Jul 23 09:24:18 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms"
}
{
      "host" => "bak2",
      "path" => "/home/vladbak2/ah.txt",
  "@version" => "1",
"@timestamp" => 2019-07-24T11:44:29.100Z,
      "tags" => [
    [0] "_dissectfailure",
    [1] "_grokparsefailure"
],
   "message" => ""
}

What I see in Kibana Discover is:

Jul 24, 2019 @ 13:44:29.100	host:bak2 path:/home/vladbak2/ah.txt @version:1 @timestamp:Jul 24, 2019 @ 13:44:29.100 tags:_dissectfailure, _grokparsefailure message: _id:VT7MI2wBgCzX-9NIFf98 _type:_doc _index:my_index1 _score:1
"message" => ""

You have blank lines in your input file. I would expect those to get _dissectfailure and _grokparsefailure.

It's fixed, thank you very much again!

Would you mind to explain this part of your code:

    grok {
    pattern_definitions => { "LETTERSANDNUMBERS" => "[A-Za-z0-9]+" }
    match => {
        "restOfLine" => [
            "^Processing file \[%{LETTERSANDNUMBERS}_%{INT:correlationId}\.",
            "^ %{LETTERSANDNUMBERS}_%{INT:correlationId}\."
        ]
    }
}

How could I devide the id Aweg3AOMTs_1563866656876839.0.dn for example on:

text: Aweg3AOMTs_
num1:  1563866656876839
num2:  0
suffix: dn

OK, so we started off with lines like these

Jul 23 09:24:16 mmr mmr-core[4268]: Processing file [Aweg3AOMTs_1563866656876839.mt], passing to thread [5147]
Jul 23 09:24:16 mmr mmr-core[5147]: Aweg3AOMTs_1563866656876839.mt database [SELECT carrier[...]

using dissect to remove the first part, that leaves us with this in the restOfLine field.

Processing file [Aweg3AOMTs_1563866656876839.mt], passing to thread [5147]
 Aweg3AOMTs_1563866656876839.mt database [SELECT carrier[...]

What I did in that grok pattern was to give a list of patterns to try against that field. Note that the patterns are anchored to the start of the field using ^, which means they fail to match very quickly.

If you want to parse the 4 fields from the id I would change that a little

    grok {
        match => {
            "restOfLine" => [
                "^Processing file \[%{NOTSPACE:correlationId}\]",
                "^ %{NOTSPACE:correlationId} "
            ]
        }
    }
    grok {
        pattern_definitions => { "SOMETEXT" => "[[:alnum:]]+" }
        match => {
            "correlationId" => [
                "^%{SOMETEXT:text}_%{INT:num1}\.%{INT:num2}\.%{DATA:suffix}$",
                "^%{SOMETEXT:text}_%{INT:num1}\.%{DATA:suffix}$"
            ]
        }
    }
1 Like

I think I understand now :smile:

Thank you for your time and great help!! :pray:

One more problem appeared and I don't know how to solve it. I think I tried every possible option to edit the config file which is:

input {

  file {
    path => "/home/vladbak2/ah.txt"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }

}    

filter {

    dissect { mapping => { "message" => "%{[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{host} %{process}: %{restOfLine}" } }
    date { match => [ "[@metadata][ts]", "MMM dd HH:mm:ss" ] }

    grok {
    match => {
        "restOfLine" => [
            "^Processing file \[%{NOTSPACE:correlationId}\]",
            "^ %{NOTSPACE:correlationId}"
        ]
    }
    }

    grok {

        pattern_definitons => {"SOMETEXT" => "[[:alnum:]]+"}
                match => {
                  "correlationId" => [
                        "^{SOMETEXT:text}_%{INT:num1}\.%{INT:num2}\.%{DATA:suffix}$",
                        "^{SOMETEXT:text}_%{INT:num1}\.%{DATA:suffix}
                   ]

                }

    }


    if "realtime" in [message] {
    grok { match => { "restOfLine" => "^ %{NOTSPACE} %{DATA:type} realtime: %{NUMBER:realtime}" } }
    }



    }

    output {
      elasticsearch {
       hosts => ["localhost:9200"]
       index =>  "my_index2"
      }

       stdout { codec => rubydebug }

    }

And it shows error in if conditional right " after if: if "realtime" in [message] { . When I remove the whole if conditional it starts showing the same error in output part right here: hosts => ["localhost:9200"]

[ERROR] 2019-07-24 19:29:37.838 [Converge PipelineAction::Create<main>] agent - Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, {, ,, ] at line 41, column 5 (byte 746) after filter {\n\ndissect { mapping => { \"message\" => \"%{[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{host} %{process}: %{restOfLine}\" } }\ndate { match => [ \"[@metadata][ts]\", \"MMM dd HH:mm:ss\" ] }\n\ngrok {\n    match => {\n        \"restOfLine\" => [\n            \"^Processing file \\[%{NOTSPACE:correlationId}\\]\",\n            \"^ %{NOTSPACE:correlationId}\"\n        ]\n    }\n}\n\ngrok {\n\n\tpattern_definitons => {\"SOMETEXT\" => \"[[:alnum:]]+\"}\n\t\tmatch => {\n\t\t  \"correlationId\" => [\n\t\t\t\"^{SOMETEXT:text}_%{INT:num1}\\.%{INT:num2}\\.%{DATA:suffix}$\",\n\t\t\t\"^{SOMETEXT:text}_%{INT:num1}\\.%{DATA:suffix}\n\t\t   ]\n\n\t\t}\n\t\t\n}\n\n\nif \"", :backtrace

What could be possibly wrong? I'm lost :cold_sweat:

Missing closing "

Oh my god I was looking for a brackets and commas ...

I fixed it and now have some problem with uknown setting pattern definition for grok

[ERROR] 2019-07-24 20:14:57.258 [Converge PipelineAction::Create<main>] grok - Unknown setting 'pattern_definitons' for grok
[ERROR] 2019-07-24 20:14:57.293 [Converge PipelineAction::Create<main>] agent - Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Something is wrong with your configuration.", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/config/mixin.rb:87:in `config_init'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:126:in `initialize'", "org/logstash/plugins/PluginFactoryExt.java:81:in `filter_delegator'", "org/logstash/plugins/PluginFactoryExt.java:251:in `plugin'", "org/logstash/execution/JavaBasePipelineExt.java:50:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:24:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in `block in converge_state'"]}
[INFO ] 2019-07-24 20:14:58.218 [LogStash::Runner] runner - Logstash shut down.

That should be pattern_definitions, with another i near the end.

Finally it's correct... Thank you for your patience :))

But "id" isnt devidede into 3 parts...

{
         "tags" => [
    [0] "_grokparsefailure"
],
      "message" => "Jul 24 14:56:11 mmr mmr-core[5210]:  Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms",
   "@timestamp" => 2019-07-24T14:56:11.000Z,
      "process" => "mmr-core[5210]",
         "path" => "/home/vladbak2/ah.txt",
         "host" => "mmr",
     "realtime" => "1.382",
   "restOfLine" => " Aweg3AOMTs_1563866656876839.0.dn processDN() #7750 realtime: 1.382 ms",
     "@version" => "1",
"correlationId" => "Aweg3AOMTs_1563866656876839.0.dn",
         "type" => "processDN() #7750"

}

Missing % near start of line, missing $" at end of second line. It should be

"^%{SOMETEXT:text}_%{INT:num1}\.%{INT:num2}\.%{DATA:suffix}$",
"^%{SOMETEXT:text}_%{INT:num1}\.%{DATA:suffix}$"