Hi
I have an all-in-one ELK setup, where Filebeat, Logstash, and Elasticsearch reside on the same machine. Currently, I am able to process live changes on designated log files, process them, and display them through Kibana. To process offline log files, i.e., when someone just drops a tar file, I created another directory on the host and specified it in filebeat.yml, distinct from the ones I use for live processing. This generally works fine. The problems I am having with respect to offline processing are as follows:
A static offline log file is ingested at the same rate as the live ones, I suppose as dictated by bulk_max_size, and I would like it to be ingested at a much faster rate.
Looking under "Discover" tab, within Kibana, the X axist (@timestamp) is based on the current system, and I am wondering if I can change this to start from offline file's creation time. That is, if current time is 14:00; however, file's creation time is 09:00, I would like to see X axis starting from 09:00.
Here is my filebeat.yml:
filebeat:
  prospectors:
    -
      paths:
        - /var/log/secure
        - /var/log/messages
        # - /var/log/*.log
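The offline directory is registered as an additional prospector, so the prospectors section ends up roughly like this (a sketch; the offline path is the one I mention further down):

filebeat:
  prospectors:
    -
      paths:
        - /var/log/secure
        - /var/log/messages
    -
      paths:
        # Directory where dropped/untarred offline log files land
        - /tmp_log/monitor/offline/*.log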
A static offline log file is ingested at the same rate as the live ones, I suppose as dictated by bulk_max_size, and I would like it to be ingested at a much faster rate.
What rate are you getting? Is it Filebeat, Logstash, or Elasticsearch that's the bottleneck?
Looking under "Discover" tab, within Kibana, the X axist (@timestamp) is based on the current system, and I am wondering if I can change this to start from offline file's creation time. That is, if current time is 14:00; however, file's creation time is 09:00, I would like to see X axis starting from 09:00.
If the lines contain a timestamp, you should use Logstash's date filter to parse it.
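For example, a minimal sketch assuming a syslog-style timestamp that grok has already extracted into a field called syslog_timestamp (the field name and patterns are just examples; adjust them to your logs):

filter {
  date {
    # Parse the original log time (e.g. "Mar  7 09:00:01") and write it
    # to @timestamp, so Kibana plots events at the time they were logged
    # rather than the time they were ingested.
    match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    target => "@timestamp"
  }
}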
Thank you for your reply.
Regarding the ingestion rate, this is what I've done:
Ran online mode for around 1 minute, and observed that there were 5 items processed and eventually displayed by Kibana, i.e., while under the "Discover" tab I saw 5 bars.
Stopped the online log generation, copied the generated log file over into /tmp_log/monitor/offline/*.log, and noted that it again takes ~1 minute to process and display the log data, which generated the same number of bars.
I doubt that it's a bottleneck issue; most likely, the issue is due to the fact that I am using the same configuration for ES, Logstash, and Filebeat in both online and offline modes. I am wondering if it is possible to change relevant parameters, e.g., bulk_max_size, based on the directory of origin, or by other means, to allow for faster processing (see the sketch below).
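For reference, this is roughly where bulk_max_size sits in my filebeat.yml (a sketch, assuming the Logstash output and illustrative values):

output:
  logstash:
    hosts: ["localhost:5044"]
    # Maximum number of events sent per batch; this is an output-level
    # setting, so it applies to all prospectors, not a single directory.
    bulk_max_size: 2048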
Thank you for the advice on the date filter. It will do nicely.
Cheers,
Ran online mode for around 1 minute, and observed that there were 5 items processed and eventually displayed by Kibana, i.e., while under the "Discover" tab I saw 5 bars.
Not sure what you mean here. The number of vertical bars in the Discover tab isn't the same as the number of processed messages.
I doubt that it's a bottleneck issue; most likely, the issue is due to the fact that I am using the same configuration for ES, Logstash, and Filebeat in both online and offline modes.
I strongly doubt that offline or online "mode" has anything to do with this.
Thank you again. To be clear then, are you saying that when a static log file is used as input, it is in fact ingested and processed all at once? (I.e., the observed behaviour under the Discover tab is just a visual artefact.)
Cheers,