Ahhh, good point. I am using test data, and I added the sincedb after the initial testing, but I have since added new data to that directory.
In the /var/lib/logstash dir I have a plugins dir and a uuid file. I don't see any sincedb file to delete.
List each subdirectory of the plugins dir. Chances are good that the file plugin is in there, and that's where the sincedb resides by default.
ls /var/lib/logstash/plugins/inputs/file/
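If that listing looks empty, keep in mind that the sincedb files are normally dotfiles (named like .sincedb_ followed by a hash), so a plain ls can hide them. A minimal sketch for finding them, assuming the default data directory /var/lib/logstash; the function name list_sincedb is just for illustration:

```shell
# List sincedb files under a Logstash data directory.
# Assumption: sincedb files keep their default hidden name (.sincedb_*).
list_sincedb() {
  local data_dir="${1:-/var/lib/logstash}"
  find "$data_dir" -type f -name '.sincedb*' 2>/dev/null
}
```

Calling `list_sincedb` with no argument searches /var/lib/logstash; pass a different directory if path.data has been changed.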
Deleted the two sincedb files; restarting Logstash to see if that solves the issue.
Now I see this in my log but still no data making it from the node and no errors:
[2017-02-21T15:09:39,548][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://198.119.28.124:9200/]}}
[2017-02-21T15:09:39,554][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://198.119.28.124:9200/, :path=>"/"}
[2017-02-21T15:09:39,661][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#}
[2017-02-21T15:09:39,662][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-02-21T15:09:39,716][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-02-21T15:09:39,722][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#]}
[2017-02-21T15:09:39,728][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
[2017-02-21T15:09:39,733][INFO ][logstash.pipeline ] Pipeline main started
[2017-02-21T15:09:39,773][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
It still works when I fire it off from the command line.
How does the output differ if you erase/empty /var/log/logstash/logstash-plain.log and run from systemd, and then run via the command-line?
Additionally, create a file output and send the output to /tmp or something, so you have another way of seeing if there is output.
When using the output to elastic, I get all kinds of WARN messages:
15:15:43.294 [[main]>worker0] WARN logstash.filters.csv - Error parsing csv {:field=>"message", :source=>"2017-02-20T01:09:18.419444+00\tX.X.X.X\t49304\tX.X.X.X\t80\tPOST\ts.update.openx.com\t/2/4.23.0/413654/joeWruXXUrnyyZ1kDMnYJVARiMdVhXxm/postback?si=537285088&ti=580c18e2-36d5-42c6-8529-b038550a0b80&pc=538571947&r1=c16f1a5f-4dc3-4c46-9a29-b1f55106b9e9&di=r%3Dwww.abc2news.com&cb=1487552943&dt=4136541438897963103000&ci=413654&oz_tc=joeWruXXUrnyyZ1kDMnYJVARiMdVhXxm&oz_sc=39605a05e39750ef72c75741&oz_st=1487552946162&oz_v=4.23.0&dp=r%3Dwww.abc2news.com&oz_df=12661&oz_l=308&cv=3 HTTP/1.1\thttp://www.abc2news.com/\tMozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36\t\t6>;ILN$./'8?-/Ki&dU GDA:Px.bamAqpy{[\"uA@#)VL-|D.|O5H$=_L4!bRpu^Q<Lo*}R|0V8l5V^#h3@1R*{$M\"{k$mGPm03B-W$pr JW:# /VB5N`|:xxGkTk\"|(InBZC4_6<?P%:;;mL[EjTulk:R#hPR\"X,4s=mP:7J?eeTk*XO)SjR8E`Im&U'%_E0tc;VY=IfO2h9D}=\"XW5JvLXc$ap\"j+[;h1~Lx%YEY4^\"4L[h]eFUu?PN\"K5{C-p4m", :exception=>#}
These are going into the cluster, but tagged as "_csvparsefailure". I guess one thing I know is that I need to work on getting urlsnarf logs into elastic better, which means more grok :(.
Strange when I output it to the screen using
When I run it from the command line I get nothing in the logstash-plain.log.
Running it with
stdout { codec => rubydebug }
it works fine and I get nicely parsed json printed to screen.
I just wish I got the WARNs when I ran it with systemctl, because then I would know it was trying to read the file.
the plain log shows
[2017-02-21T15:28:27,891][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://host1:9200/]}}
[2017-02-21T15:28:27,895][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://host1:9200/, :path=>"/"}
[2017-02-21T15:28:27,979][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#}
[2017-02-21T15:28:27,988][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-02-21T15:28:28,028][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-02-21T15:28:28,035][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#]}
[2017-02-21T15:28:28,039][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
[2017-02-21T15:28:28,042][INFO ][logstash.pipeline ] Pipeline main started
[2017-02-21T15:28:28,076][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
At this point, it seems we're having some traffic flow, it's just manifesting differently based on how you started. Is that accurate? In either case, you can change log.level in /etc/logstash/logstash.yml to debug and see what you get in the output log file.
I would say that I can get things into the cluster from logstash when running it from the command line, but not from the daemon.
When I run it from command line and tell it to go to elastic, I am getting some CSV parse errors which are most definitely my fault but at least the data goes there.
When I run it from daemon I get no data at all going to elasticsearch.
last attempt for today:
I ran with debug. It is definitely reading the conf file, because the logstash.filters.csv section of the log shows it.
I then see this:
[2017-02-21T15:51:02,357][DEBUG][logstash.inputs.file ] _globbed_files: /home/tdesroch/test_url/*: glob is: []
[2017-02-21T15:51:03,755][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
[2017-02-21T15:51:08,755][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
[2017-02-21T15:51:13,755][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
Which I think is it trying to read things in that directory, but I am not sure if it is.
The ls of the directory it is reading is:
-rwxr-xr-x. 1 tdesroch tdesroch 113M Feb 20 10:51 2017-02-19T00+00
-rwxr-xr-x. 1 tdesroch tdesroch 104M Feb 20 10:51 2017-02-19T01+00
-rwxr-xr-x. 1 tdesroch tdesroch 93M Feb 20 10:51 2017-02-19T02+00
-rwxr-xr-x. 1 tdesroch tdesroch 92M Feb 20 10:51 2017-02-19T03+00
-rwxr-xr-x. 1 tdesroch tdesroch 84M Feb 20 10:51 2017-02-19T04+00
-rwxr-xr-x. 1 tdesroch tdesroch 88M Feb 20 10:51 2017-02-19T05+00
-rwxr-xr-x. 1 tdesroch tdesroch 86M Feb 20 10:51 2017-02-19T06+00
-rwxr-xr-x. 1 tdesroch tdesroch 87M Feb 20 10:51 2017-02-19T07+00
-rwxr-xr-x. 1 tdesroch tdesroch 89M Feb 20 10:51 2017-02-19T08+00
-rwxr-xr-x. 1 tdesroch tdesroch 94M Feb 20 10:51 2017-02-19T09+00
-rwxr-xr-x. 1 tdesroch tdesroch 93M Feb 20 10:51 2017-02-19T10+00
-rwxr-xr-x. 1 tdesroch tdesroch 100M Feb 20 10:51 2017-02-19T11+00
-rwxr-xr-x. 1 tdesroch tdesroch 82M Feb 20 10:51 2017-02-19T12+00
-rwxr-xr-x. 1 tdesroch tdesroch 99M Feb 20 10:51 2017-02-19T13+00
-rwxr-xr-x. 1 tdesroch tdesroch 129M Feb 20 10:51 2017-02-19T14+00
-rwxr-xr-x. 1 tdesroch tdesroch 132M Feb 20 10:51 2017-02-19T15+00
-rwxr-xr-x. 1 tdesroch tdesroch 114M Feb 20 10:51 2017-02-19T16+00
-rwxr-xr-x. 1 tdesroch tdesroch 136M Feb 20 10:51 2017-02-19T17+00
-rwxr-xr-x. 1 tdesroch tdesroch 131M Feb 20 10:51 2017-02-19T18+00
-rwxr-xr-x. 1 tdesroch tdesroch 117M Feb 20 10:51 2017-02-19T19+00
-rwxr-xr-x. 1 tdesroch tdesroch 110M Feb 20 10:51 2017-02-19T20+00
-rwxr-xr-x. 1 tdesroch tdesroch 111M Feb 20 10:51 2017-02-19T21+00
-rwxr-xr-x. 1 tdesroch tdesroch 106M Feb 20 10:52 2017-02-19T22+00
-rwxr-xr-x. 1 tdesroch tdesroch 125M Feb 20 10:52 2017-02-19T23+00
-rwxr-xr-x. 1 tdesroch tdesroch 106M Feb 21 12:03 2017-02-20T00+00
-rwxr-xr-x. 1 tdesroch tdesroch 107M Feb 21 12:03 2017-02-20T01+00
-rwxr-xr-x. 1 tdesroch tdesroch 115M Feb 21 12:03 2017-02-20T02+00
-rwxr-xr-x. 1 tdesroch tdesroch 112M Feb 21 12:03 2017-02-20T03+00
-rwxr-xr-x. 1 tdesroch tdesroch 91M Feb 21 12:03 2017-02-20T04+00
-rwxr-xr-x. 1 tdesroch tdesroch 108M Feb 21 12:03 2017-02-20T05+00
-rwxr-xr-x. 1 tdesroch tdesroch 167M Feb 21 12:03 2017-02-20T06+00
-rwxr-xr-x. 1 tdesroch tdesroch 115M Feb 21 12:03 2017-02-20T07+00
-rwxr-xr-x. 1 tdesroch tdesroch 104M Feb 21 12:03 2017-02-20T08+00
-rwxr-xr-x. 1 tdesroch tdesroch 109M Feb 21 12:03 2017-02-20T09+00
-rwxr-xr-x. 1 tdesroch tdesroch 120M Feb 21 12:03 2017-02-20T10+00
-rwxr-xr-x. 1 tdesroch tdesroch 116M Feb 21 12:03 2017-02-20T11+00
-rwxr-xr-x. 1 tdesroch tdesroch 123M Feb 21 12:03 2017-02-20T12+00
-rwxr-xr-x. 1 tdesroch tdesroch 298M Feb 21 12:03 2017-02-20T13+00
-rwxr-xr-x. 1 tdesroch tdesroch 241M Feb 21 12:03 2017-02-20T14+00
-rwxr-xr-x. 1 tdesroch tdesroch 203M Feb 21 12:03 2017-02-20T15+00
-rwxr-xr-x. 1 tdesroch tdesroch 215M Feb 21 12:03 2017-02-20T16+00
-rwxr-xr-x. 1 tdesroch tdesroch 242M Feb 21 12:03 2017-02-20T17+00
-rwxr-xr-x. 1 tdesroch tdesroch 203M Feb 21 12:03 2017-02-20T18+00
-rwxr-xr-x. 1 tdesroch tdesroch 229M Feb 21 12:03 2017-02-20T19+00
-rwxr-xr-x. 1 tdesroch tdesroch 194M Feb 21 12:03 2017-02-20T20+00
-rwxr-xr-x. 1 tdesroch tdesroch 176M Feb 21 12:03 2017-02-20T21+00
-rwxr-xr-x. 1 tdesroch tdesroch 154M Feb 21 12:03 2017-02-20T22+00
-rwxr-xr-x. 1 tdesroch tdesroch 143M Feb 21 12:04 2017-02-20T23+00
They are logs from urlsnarf.
Thanks again.
I wonder if it is reading also.
I understand that you are doing dev work right now. What is your eventual plan for consuming these files? Is it still the file input? Or something else, like reading over the network? I ask because it seems you're going to have a continuous stream of new files in the directory. This is not a good fit for the file input plugin, mostly because there are so many files to track via sincedb (it identifies files by inode, and inodes can end up getting reused, which gives false positives for a file having been read). I had no idea how many files you were trying to read with a glob. The Logstash file input plugin can also get bogged down trying to read from each simultaneously when using globs.
In the past, when faced with a similar case (CDN access logs pulled from a remote source), I wrote a script to run periodically and catch new files, keep track of what had already been read, and send via a network port to Logstash, completely bypassing the file plugin. Of course, that was before there was filebeat, though I've never tried it for this use case (lots of files incoming into a single directory over time).
I don't know what will be the best fit for you, since I do not know how the files will be arriving, how many, and how often, but I do recommend giving it some consideration.
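For what it's worth, the "catch new files, remember what was sent, push over the network" approach can be sketched roughly like this. It is not the script I used back then, just a minimal illustration: the function name, the state file, and the host and port in the usage line are all hypothetical, and it assumes a Logstash tcp input is listening on the other end.

```shell
# Send each not-yet-shipped file in a directory to a sender command's stdin,
# and remember shipped files in a state file. All names are illustrative.
ship_new_files() {
  local src_dir="$1" state_file="$2" f
  shift 2                          # remaining arguments: the sender command
  touch "$state_file"
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue        # skip subdirectories and an unmatched glob
    grep -Fxq -- "$f" "$state_file" && continue   # already shipped
    # Record the file only if the sender exits successfully.
    if "$@" < "$f"; then
      printf '%s\n' "$f" >> "$state_file"
    fi
  done
}
```

Run from cron, it might look like `ship_new_files /data/urlsnarf /var/tmp/shipped.list nc logstash-host 5000`. Depending on the nc variant, you may need an extra flag to make it exit at end of input. Keep the state file outside the source directory, and note that this sketch does not handle files that are still being written when it runs.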
Does the logstash user have read permissions for the files in /home/tdesroch/test_url/*? If the Logstash user cannot navigate past /home, and /home/tdesroch, and /home/tdesroch/test_url, it won't matter if the files in the final directory are all readable.
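One rough way to check the whole chain of directories is sketched below. It assumes GNU stat, and it assumes the logstash user is neither the owner nor in the group of these directories, so only the "other" execute bit is inspected; ACLs and SELinux are ignored. The function name check_traversal is made up for this example.

```shell
# Walk every directory component of a path and flag those that lack the
# world-execute (search) bit. Assumption: the service user falls under
# "other" for each directory; ACLs and SELinux are not considered.
check_traversal() {
  local path="$1" rc=0 mode
  while [ -n "$path" ] && [ "$path" != "/" ] && [ "$path" != "." ]; do
    if [ -d "$path" ]; then
      mode=$(stat -c '%A' "$path")   # e.g. drwx------ (GNU stat)
      case "$mode" in
        *[xt]) echo "ok      $mode $path" ;;
        *)     echo "BLOCKED $mode $path"; rc=1 ;;
      esac
    fi
    path=$(dirname "$path")
  done
  return "$rc"
}
```

For example, `check_traversal /home/tdesroch/test_url` prints one line per directory, and a home directory with mode drwx------ would show up as BLOCKED. If sudo is available, `sudo -u logstash ls /home/tdesroch/test_url/` is a more direct test.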
My end goal is to consume dsniff urlsnarf logs as they get written (I have inherited some architecture and I'm trying to unscramble it). From what I read, urlsnarf logs should be in CLF, so I should be able to use an existing grok pattern to consume them. The suricata stuff is easy since the logs can be written in JSON.
What I am testing here are dsniff urlsnarf logs I pulled from an existing sensor to prove the concept of them getting ingested and what they would look like for an analyst. I want to show the customer what the value of this tool is (elastic, not dsniff; it has its place but it's dated).
The end game should be bro sensors and suricata sensors writing JSON logs that are shipped with filebeat to a logstash cluster or ingest nodes, then on to elastic for a security team of analysts.
I have been using the 2.X suite of ELK tools for the past 3 years, so I am learning the new architecture, ingest nodes, and configs of the new elastic stack.
The issue, as with most, was user error. I had the test data in my home directory and the logstash user could not read it. Once I fixed the permissions it worked as expected.
NOTE: nothing in the logs indicated a read error. Maybe some sort of error logging indicating read failures would be helpful.
Thanks again for the assistance.
I think the empty glob: [] may have been the indicator. Is it still empty after fixing that?
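With log.level set to debug, something like the following could pull those lines out of a busy log. It is only a sketch: the match strings are taken from the debug line quoted earlier in this thread, the default log path is the one mentioned above, and the function name find_empty_globs is made up.

```shell
# Print debug lines where the file input's glob matched no files.
# The match strings come from the "_globbed_files: ... glob is: []" line
# quoted earlier in this thread; other Logstash versions may word it differently.
find_empty_globs() {
  local log_file="${1:-/var/log/logstash/logstash-plain.log}"
  grep -F '_globbed_files:' "$log_file" | grep -F 'glob is: []'
}
```

No output (and a non-zero exit status) means no empty-glob lines were found in that file.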
I will have to check. While tailing /var/log/logstash/logstash-plain.log it fills with scrolling data, so I will look through and see if I can find it.