Logstash file input works on one path but not others


(Scott Newby) #1

I was able to load the JSON logs we're generating, flatten them with a Ruby script, and then load them into Elasticsearch. We're on a Windows Server 2012 environment, Logstash 6.3.0.

This works on one path, but not another

input {
    file {
        # File monitor - glob to monitor multiple directories.
        # This works:
        #   path => ["/Jenkins/jobs/Testing Pipeline/jobs/Pipeline/builds/**/cucumber-html-reports/.cache/*.json"]
        # This doesn't work:
        #   path => ["/Jenkins/jobs/Testing Pipeline_UnderDevelopment/jobs/projecg-name_herewithunderscore/builds/**/cucumber-html-reports/.cache/reports/*.json"]
        # This also doesn't work:
        #   path => ["/ELK/Test/reports/*.json"]
        start_position => "beginning"
        # Use the multiline codec to treat the message as one big message.
        codec => multiline { pattern => "something" negate => true what => "previous" auto_flush_interval => 2 }
    }
}

This is running from my pipelines.yml, and has been tested against each of the path cases above.

In the first two cases, .cache is not a hidden directory, and that wouldn't explain why the third path isn't working either. I was wondering if the underscore was causing an issue.

I've deleted the data directory between executions to remove any sincedb tracking; besides, the files are not duplicates. I've checked permissions across all three paths and they are identical.

Last log file below:

[2018-10-11T16:00:57,106][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"c:/ELK/logstash/data/queue"}
[2018-10-11T16:00:57,122][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"c:/ELK/logstash/data/dead_letter_queue"}
[2018-10-11T16:00:57,525][INFO ][logstash.agent ] No persistent UUID file found. Generating new UUID {:uuid=>"eeb40e51-c9d7-4663-b2dd-3df0db29cc92", :path=>"c:/ELK/logstash/data/uuid"}
[2018-10-11T16:00:58,520][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.3.0"}
[2018-10-11T16:01:04,906][INFO ][logstash.filters.ruby.script] Test run complete {:script_path=>"/ELK/logstash/config/Code/cucumber-agg-json.rb", :results=>{:passed=>0, :failed=>0, :errored=>0}}
[2018-10-11T16:01:05,159][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"CucumberAggregationAlt", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-10-11T16:01:05,779][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-10-11T16:01:05,795][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-10-11T16:01:06,089][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-10-11T16:01:06,303][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-10-11T16:01:06,314][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}
[2018-10-11T16:01:06,348][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-10-11T16:01:06,379][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"default"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-10-11T16:01:06,449][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost"]}
[2018-10-11T16:01:07,754][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"CucumberAggregationAlt", :thread=>"#<Thread:0x52219e69 run>"}
[2018-10-11T16:01:07,881][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:CucumberAggregationAlt], :non_running_pipelines=>[]}
[2018-10-11T16:01:08,546][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-10-11T16:15:59,122][WARN ][logstash.runner ] SIGINT received. Shutting down.
[2018-10-11T16:16:00,298][INFO ][logstash.pipeline ] Pipeline has terminated {:pipeline_id=>"CucumberAggregationAlt", :thread=>"#<Thread:0x52219e69 run>"}

I'm at a loss here - Scott


(Scott Newby) #2

This appears to be a data ingestion issue with the JSON we're trying to ingest, not a path issue. I moved known-working files into the path(s) in question and Logstash was able to read them, and vice versa, to validate.


(Dave Moore) #3

@Scott_Newby

Can you execute the following pipeline in your terminal and share your observations? Ideally you will see Logstash printing parsed log messages to your terminal.

input {
    file {
        path => ["/path/to/file"] # Replace me
        start_position => "beginning"
        sincedb_path => "NUL"
        codec => multiline {
            pattern => "something" # Replace me
            negate => true
            what => "previous"
            auto_flush_interval => 2
        }
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

Please take note:

  • Replace the values of path and pattern.
  • I added sincedb_path => "NUL" to ensure that the file is read from the beginning whenever you run this pipeline.
  • The stdout output will print each message to your terminal in a structured format, allowing you to see if/how the messages are being parsed.

(Scott Newby) #4

The output of the logs we were ingesting changed from multi-line JSON to single-line JSON, which might make things easier.
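If each event is now a single JSON document per line, the multiline codec could likely be dropped in favor of the json codec, which parses every line as its own event. A sketch along the lines of Dave's debug pipeline above (path is a placeholder):

```
input {
    file {
        path => ["/path/to/file"] # Replace me
        start_position => "beginning"
        sincedb_path => "NUL"
        codec => "json"
    }
}
```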


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.