Logstash pipeline not ingesting entire file


#1

Hi,

I'm having an issue with my pipeline not reading all of the lines in my file. I am sure the pattern matches all of the lines.

I am using a sample log file with 5 lines of standard syntax for my grok pattern. However, I see an output for the first three lines and then Logstash just hangs. I then do ctrl-c and I get the message asking if I want to terminate the batch job. Before I answer, the fourth line gets pushed through the pipeline. I still then have one more line not being read. This happens when I try parsing other files, too. It seems like it always gets stuck on the second to last line, and then the final line after it pushes from the ctrl-c.

I apologize if any of this is unclear. I have pasted the output from my Logstash console below.

Logstash output:

D:\apps\ELK2\logstash-5.3.0\bin>logstash -f tesb.conf
Sending Logstash's logs to D:\apps\ELK2\logstash-5.3.0\logs which is now configu
red via log4j2.properties
[2017-04-28T12:14:32,425][INFO ][logstash.outputs.elasticsearch] Elasticsearch p
ool URLs updated {:changes=>{:removed=>[], :added=>[http://cii6w760:9201/]}}
[2017-04-28T12:14:32,425][INFO ][logstash.outputs.elasticsearch] Running health
check to see if an Elasticsearch connection is working {:healthcheck_url=>http:/
/cii6w760:9201/, :path=>"/"}
[2017-04-28T12:14:32,534][WARN ][logstash.outputs.elasticsearch] Restored connec
tion to ES instance {:url=>#<URI::HTTP:0x1fcbd811 URL:http://cii6w760:9201/>}
[2017-04-28T12:14:32,534][INFO ][logstash.outputs.elasticsearch] Using mapping t
emplate from {:path=>nil}
[2017-04-28T12:14:32,565][INFO ][logstash.outputs.elasticsearch] Attempting to i
nstall template {:manage_template=>{"template"=>"logstash-*", "version"=>50001,
"settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=
>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"pa
th_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text"
, "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"str
ing", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=
>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"
=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"d
ynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_po
int"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}
}}}}}
[2017-04-28T12:14:32,565][INFO ][logstash.outputs.elasticsearch] New Elasticsear
ch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0
x53548e22 URL://cii6w760:9201>]}
[2017-04-28T12:14:32,627][WARN ][logstash.pipeline        ] Defaulting pipeline
worker threads to 1 because there are some filters that might not work with mult
iple worker threads {:count_was=>8, :filters=>["memorize"]}
[2017-04-28T12:14:32,643][INFO ][logstash.pipeline        ] Starting pipeline {"
id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.
delay"=>5, "pipeline.max_inflight"=>125}
[2017-04-28T12:14:32,879][INFO ][logstash.pipeline        ] Pipeline main starte
d
[2017-04-28T12:14:32,973][INFO ][logstash.agent           ] Successfully started
 Logstash API endpoint {:port=>9600}
{
          "data3" => "com.teamcenter.esb.logger",
      "tesb_date" => "2017-04-22 15:17:29,574",
          "data2" => ".teamcenter.esb.logger.ESBLogger ",
          "data1" => "onsumer[process]",
        "message" => "No mapping found in registry for template name: design tem
plate and/or partition name: APALab Partition\r",
           "type" => "log",
           "path" => "d:/apps/syslogs/trythis/TestTesb.log",
    "environment" => "TestTesb",
     "@timestamp" => 2017-04-28T16:14:32.910Z,
      "log-level" => "WARN",
       "@version" => "1",
           "host" => "CII6W760",
           "num1" => "140",
        "decimal" => "10.1.6",
           "num2" => "219"
}
{
          "data3" => "org.apache.commons.dbcp2",
      "tesb_date" => "2017-04-22 15:00:06,967",
          "data2" => ".commons.dbcp2.PoolingDataSource  ",
          "data1" => "s4j.datasource])",
        "message" => "PoolableConnectionFactory not linked to pool. Calling setP
ool() to fix the configuration.\r",
           "type" => "log",
           "path" => "d:/apps/syslogs/trythis/TestTesb.log",
    "environment" => "TestTesb",
     "@timestamp" => 2017-04-28T16:14:32.910Z,
      "log-level" => "WARN",
       "@version" => "1",
           "host" => "CII6W760",
           "num1" => "65",
        "decimal" => "2.0.1",
           "num2" => "263"
}
{
          "data3" => "org.apache.camel.camel-jms",
      "tesb_date" => "2017-04-22 15:01:16,300",
          "data2" => "nent.jms.reply.QueueReplyManager ",
          "data1" => "qtp718023231-131",
        "message" => "Endpoint[activemq://queue:process?replyTo=sharedReplyQueue
&requestTimeout=3600000] is using a shared reply queue, which is not as fast as
alternatives. See more detail at the section 'Request-reply over JMS' at http://
camel.apache.org/jms\r",
           "type" => "log",
           "path" => "d:/apps/syslogs/trythis/TestTesb.log",
    "environment" => "TestTesb",
     "@timestamp" => 2017-04-28T16:14:32.926Z,
      "log-level" => "WARN",
       "@version" => "1",
           "host" => "CII6W760",
           "num1" => "145",
        "decimal" => "2.15.2",
           "num2" => "253"
}
[2017-04-28T12:14:42,219][WARN ][logstash.runner          ] SIGINT received. Shu
tting down the agent.
^CTerminate batch job (Y/N)? [2017-04-28T12:14:42,219][WARN ][logstash.agent
       ] stopping pipeline {:id=>"main"}
{
          "data3" => "com.teamcenter.esb.logger",
      "tesb_date" => "2017-04-22 15:12:21,166",
          "data2" => ".teamcenter.esb.logger.ESBLogger ",
          "data1" => "onsumer[process]",
        "message" => "error occurred while parsing postDate property.null\r",
           "type" => "log",
           "path" => "d:/apps/syslogs/trythis/TestTesb.log",
    "environment" => "TestTesb",
     "@timestamp" => 2017-04-28T16:14:42.987Z,
      "log-level" => "ERROR",
       "@version" => "1",
           "host" => "CII6W760",
           "num1" => "131",
        "decimal" => "10.1.6",
           "num2" => "219"
}

Also, I am using a Windows machine.

Thank you for your help!

Miranda


#2

Because of a character limit, I was not able to paste the input file in the original post. I have pasted it below.

File input:

2017-04-22 15:17:29,574 | WARN  | onsumer[process] | .teamcenter.esb.logger.ESBLogger  140 | 219 - com.teamcenter.esb.logger - 10.1.6 | No mapping found in registry for template name: design 

template and/or partition name: APALab Partition
2017-04-22 15:00:06,967 | WARN  | s4j.datasource])| .commons.dbcp2.PoolingDataSource   65 | 263 - org.apache.commons.dbcp2 - 2.0.1 | PoolableConnectionFactory not linked to pool. Calling 

setPool() to fix the configuration.
2017-04-22 15:01:16,300 | WARN  | qtp718023231-131 | nent.jms.reply.QueueReplyManager  145 | 253 - org.apache.camel.camel-jms - 2.15.2 | Endpoint[activemq://queue:process?

replyTo=sharedReplyQueue&requestTimeout=3600000] is using a shared reply queue, which is not as fast as alternatives. See more detail at the section 'Request-reply over JMS' at 

http://camel.apache.org/jms
2017-04-22 15:12:21,166 | ERROR | onsumer[process] | .teamcenter.esb.logger.ESBLogger  131 | 219 - com.teamcenter.esb.logger - 10.1.6 | error occurred while parsing postDate property.null
2017-04-22 15:12:29,512 | WARN  | onsumer[process] | .teamcenter.esb.logger.ESBLogger  140 | 219 - com.teamcenter.esb.logger - 10.1.6 | No mapping found in registry for template name: design 

template and/or partition name: APALab Partition

Thank you!
Miranda


(Andrew Cholakian) #3

@dolanmk the reason you're seeing the final line after Ctrl-C is that logstash will flush any multiline buffers on shutdown.

My best guess is that you're using Multiline in some capacity, and that the terminating line isn't being matched correctly.

It's hard to see without seeing sample data / your config.


#4

@Andrew_Cholakian1

I apologize for the delay in my response.

I am using Multiline. Here is a sample from my config file. DATESTAMP_TESB is a custom pattern matching the date stamp in the log files I am sending.

  codec => multiline {
    patterns_dir => ["./patterns""]
    pattern => "^%{DATESTAMP_TESB}"
    negate => true
    what => previous
  }

However, in the sample log file that I posted above, multiline shouldn't be used. They may have formatted incorrectly in my post, but in the actual file there are only 5 log lines, each starting with the date stamp.

Thank you again for your help!
Miranda


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.