I want hourly Filebeat indices.
In the docs for filebeat.yml, I only see:
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
I tried with
output:
  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'
But it doesn't work. In elasticsearch.log I see this error:
[2016-12-07 12:22:41,525][DEBUG][action.admin.indices.create] [Fury] [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] failed to create
[filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] InvalidIndexNameException[Invalid index name [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07], must be lowercase]
I use Filebeat version 1.2.1 on Linux RHEL 7.1.
Is it possible to have hourly indices for Filebeat?
And if yes, how can it be done?
I just updated to Filebeat 5.0.2 (and Elasticsearch 5.0.2, Logstash 5.0.2, and Kibana 5.0.2),
and on Elasticsearch no filebeat-%{+yyyy.MM.dd.HH} index is created.
Looking at filebeat.log, I now see these errors:
2016-12-07T17:38:04+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:24+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:54+01:00 ERR Failed to publish events caused by: EOF
Almost 40 GB/day, but it can be much more if many error logs occur.
Logs are not kept on a physical server; the ELK Docker container runs on a VMware VM.
And I would like to drop indices at an hourly frequency if there are too many of them.
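I assume the dropping part itself is just a delete of each hourly index once it is old enough, something like this (the index name here is a made-up example):

curl -XDELETE 'http://localhost:9200/filebeat-2016.12.07.00'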
By the way, can you help me with the "Failed to publish events caused by: EOF" error in filebeat.log?
logstash-plain.log says:
Our recommendation is to keep shards under 50 GB, so in reality you aren't even reaching that in a day. You're definitely going to be wasting resources by moving to hourly.
Yes, I understand that 40 GB is under 50 GB.
But that's an average, and if I get 400 GB in just one day,
then I don't want to drop daily indices but hourly indices.
Steffen Siering told me that the daily index is available in Filebeat 5.0+.
So what is the correct syntax to make it work?
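In case it matters, if I were writing straight to Elasticsearch instead of going through Logstash, I would expect the hourly pattern to live in the elasticsearch output, something like this (a sketch only; I have not confirmed that the HH token is honored):

output.elasticsearch:
  hosts: ["localhost:9200"]
  # assumed: same date tokens as the default filebeat-%{+yyyy.MM.dd}, plus HH for the hour
  index: "filebeat-%{+yyyy.MM.dd.HH}"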
This will allow you to do weekly indices, but if there's an unexpected burst of events, the indices will automatically roll over after hitting a specific max size.
You really want to avoid having an unnecessary number of shards in your cluster, as it will cause overhead and impact your cluster's health and performance at some point. Two common causes of this are daily indices on clusters that would be better suited to weekly or another rollover frequency, and configuring indices to use far too many shards by default; often it is a combination of these two factors.
Thanks Peter.
I followed your advice and used the rollover API.
Today's Filebeat index is named filebeat-2016.12.08.
So I want it to roll over each hour, or for testing let's say every 1000 docs:
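The call I have in mind looks roughly like this (a sketch; the filebeat alias name and the explicit target index name are my assumptions):

curl -XPOST 'http://localhost:9200/filebeat/_rollover/filebeat-2016.12.08-1' -d '{
  "conditions": {
    "max_age": "1h",
    "max_docs": 1000
  }
}'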
OK, now filebeat-2016.12.08-1 is created because of the rollover:
curl -XGET 'http://localhost:9200/_cat/indices?v'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open filebeat-2016.12.08 qUduxdndTcWwYqBcdDH_lA 5 1 46053 0 13.8mb 13.8mb
yellow open filebeat-2016.12.08-1 yYSVbfufQROtXjL0aW9rYw 5 1 0 0 650b 650b
But there is a first problem: Logstash is still sending logs to filebeat-2016.12.08 and not to filebeat-2016.12.08-1.
And secondly, one hour (or 1000 docs) later, no filebeat-2016.12.08-2 is created.
As I understand it, rollover works on an index alias and not on an index.
How can I tell Filebeat to put logs into the alias and not the index?
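To make the question concrete, this is the alias setup I am picturing (hypothetical names): create an index with a write alias attached, point the writer at the alias, and let _rollover move the alias to each new index.

curl -XPUT 'http://localhost:9200/filebeat-000001' -d '{
  "aliases": { "filebeat-write": {} }
}'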
Oooohhh... I see you're trying to set the ES index on the logstash output? That doesn't work: the index option in Filebeat's outputs is not supported by the logstash output. It's in Logstash that you have to set the index name.
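Something along these lines in your Logstash pipeline config should give you hourly indices (a sketch; adjust the hosts to your setup):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Logstash expands the Joda date pattern per event, so each hour gets its own index
    index => "filebeat-%{+yyyy.MM.dd.HH}"
  }
}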