Hourly index for Filebeat

Hello,

I want hourly Filebeat indices.
In the docs for filebeat.yml, I only see:
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"

I tried with
output:
  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'

But it doesn't work. In elasticsearch.log I see this error:
[2016-12-07 12:22:41,525][DEBUG][action.admin.indices.create] [Fury] [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] failed to create
[filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] InvalidIndexNameException[Invalid index name [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07], must be lowercase]

I use Filebeat version 1.2.1 on Linux RHEL 7.1.

Is it possible to have hourly indices for Filebeat?
And if yes, how can it be done?

Regards
Laurent

The feature you're using is only available in Filebeat 5.0+.
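
For reference, with Filebeat 5.x shipping directly to Elasticsearch, the format string in the index option is expanded per event. A minimal sketch (the hosts value and pattern are just examples):

output.elasticsearch:
  hosts: ["localhost:9200"]
  # the format string is expanded per event, e.g. filebeat-2016.12.08.09
  index: "filebeat-%{+yyyy.MM.dd.HH}"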

Thank you Steffen.

I just updated to Filebeat 5.0.2 (and Elasticsearch 5.0.2, Logstash 5.0.2, and Kibana 5.0.2),
but no filebeat-%{+yyyy.MM.dd.HH} index is created in Elasticsearch.

Looking at filebeat.log, I now see these errors:
2016-12-07T17:38:04+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:24+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:54+01:00 ERR Failed to publish events caused by: EOF

Here is an extract of my filebeat.yml:

  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'
    ssl:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-beats.crt"]

Laurent

Unless you have very large amounts of data, having hourly indices doesn't make much sense, and you end up wasting resources.


Indeed, it is a "large amount of data".

How large?

Hello Mark,

Almost 40 GB/day, but it can be much more if many error logs occur.
Logs are not kept on a physical server; the ELK Docker container runs on a VMware VM.
And I would like to drop indices at an hourly granularity if there are too many of them.
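
For example, dropping an old hourly index would just be a delete (the index name here is hypothetical):

curl -XDELETE 'localhost:9200/filebeat-2016.12.08.09?pretty'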

By the way, can you help me with the "Failed to publish events caused by: EOF" error in filebeat.log?
logstash-plain.log says:

[2016-12-08T08:14:26,598][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08", :_type=>"log", :_routing=>nil}, 2016-12-08T08:14:23.407Z lmtesm1d.dmz.e-i.net 10.46.41.222 - - [08/Dec/2016:09:14:05 +0100] "GET /resource/1.0.0/artifact/_system/governance/trunk/restservices/3.0.1 HTTP/1.1" 200 90 "-" "Mozilla/4.0 [en] (WinNT; I)"], :response=>{"index"=>{"_index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08", "_type"=>"log", "_id"=>nil, "status"=>400, "error"=>{"type"=>"invalid_index_name_exception", "reason"=>"Invalid index name [filebeat-%{+yyyy.MM.dd.HH}-2016.12.08], must be lowercase", "index_uuid"=>"_na_", "index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08"}}}}

It shouldn't be _index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08"
but _index"=>"filebeat-2016.12.08.09"

What is wrong with filebeat.yml?

  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'

Regards
Laurent

That's not large enough to do hourly indices.

Our recommendation is to keep shards under 50GB, so in reality you aren't even doing that in a day. You're definitely going to be wasting resources by moving to hourly.

Yes, I understand that 40 GB is under 50 GB.
But that's an average, and if I get 400 GB in a single day,
then I want to drop hourly indices, not daily ones.
Steffen Siering told me that the hourly index feature is available in Filebeat 5.0+.
So what is the correct syntax to make it work?

You might find some benefit in using the Rollover Index API:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/indices-rollover-index.html

This will allow you to do weekly indices, but if there's an unexpected burst of events, the index will automatically roll over after hitting a specific maximum document count.
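
A minimal sketch of the mechanics (the alias name filebeat-write and the conditions are just examples; the first index name must end in a number so the API can derive the next one):

# create the first index and point a write alias at it
curl -XPUT 'localhost:9200/filebeat-000001?pretty' -d'
{
  "aliases": { "filebeat-write": {} }
}'

# roll over when either condition is met
curl -XPOST 'localhost:9200/filebeat-write/_rollover?pretty' -d'
{
  "conditions": { "max_age": "7d", "max_docs": 100000000 }
}'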

You really want to avoid having an unnecessary number of shards in your cluster, as it will cause overhead and impact your cluster health and performance at some point. Two common causes are daily indices on clusters better suited to weekly or another rollover frequency, and indices configured with far too many shards by default; often it's a combination of the two.


Thanks Peter.
I followed your advice and used the Rollover API.
Today's Filebeat index is named filebeat-2016.12.08.
So I want it to roll over each hour, or for testing, let's say every 1000 docs:

curl -XPOST 'localhost:9200/filebeat/_rollover/filebeat-2016.12.08-1?pretty' -d'
 {
   "conditions": {
     "max_age":   "1h",
     "max_docs":  1000
   }
 }'

{
  "old_index" : "filebeat-2016.12.08",
  "new_index" : "filebeat-2016.12.08-1",
  "rolled_over" : true,
  "dry_run" : false,
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "conditions" : {
    "[max_age: 1h]" : true,
    "[max_docs: 1000]" : true
  }
}

OK, now filebeat-2016.12.08-1 has been created because of the rollover:

curl -XGET 'http://localhost:9200/_cat/indices?v'
health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   filebeat-2016.12.08   qUduxdndTcWwYqBcdDH_lA   5   1      46053            0     13.8mb         13.8mb
yellow open   filebeat-2016.12.08-1 yYSVbfufQROtXjL0aW9rYw   5   1          0            0       650b           650b

But there are two problems. First, Logstash is still sending logs to filebeat-2016.12.08 and not to filebeat-2016.12.08-1.
And second, one hour (or 1000 docs) later, no filebeat-2016.12.08-2 is created.

As I understand it, rollover applies to an index alias, not to an index.
How can I tell Filebeat to put logs into the alias and not the index?
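
I suppose the Logstash output would then have to write to the alias, something like this (the alias name filebeat-write is just a guess):

output {
  elasticsearch {
    hosts => ["localhost"]
    # write to the rollover alias; the Rollover API swaps the underlying index
    index => "filebeat-write"
  }
}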

Oooohhh... I see you're trying to set the ES index on the logstash output? This doesn't work: the index option in Filebeat outputs is not supported by the logstash output. It's in Logstash that you have to set the index name.


In /etc/logstash/conf.d/30-output.conf there is:

output {
  elasticsearch {
    hosts => ["localhost"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

I replaced index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
with index => "filebeat-%{+YYYY.MM.dd.HH}"
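
So the full output block is now:

output {
  elasticsearch {
    hosts => ["localhost"]
    manage_template => false
    # hourly index, e.g. filebeat-2016.12.08.09
    index => "filebeat-%{+YYYY.MM.dd.HH}"
    document_type => "%{[@metadata][type]}"
  }
}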

After restarting Logstash and deleting the existing indices, it now works.

Thank you Steffen.

Best regards
Laurent
