Hourly index for filebeat

lolo67 · December 7, 2016, 12:30pm

Hello,

I want hourly Filebeat indices.
In the doc for filebeat.yml, I only see :
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"

I tried with
output:
  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'

But it doesn't work. In elasticsearch.log I see this error:
[2016-12-07 12:22:41,525][DEBUG][action.admin.indices.create] [Fury] [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] failed to create
[filebeat-%{+yyyy.MM.dd.HH}-2016.12.07] InvalidIndexNameException[Invalid index name [filebeat-%{+yyyy.MM.dd.HH}-2016.12.07], must be lowercase]

I use Filebeat version 1.2.1 on Linux RHEL 7.1.

Is it possible to have an hourly index for Filebeat?
And if so, how can it be done?

Regards
Laurent

steffens · December 7, 2016, 12:45pm

The feature you're using is only available in Filebeat 5.0+.

lolo67 · December 7, 2016, 3:16pm

Thank you Steffen.

I just updated to filebeat 5.0.2 (and Elasticsearch 5.0.2, Logstash 5.0.2, and Kibana 5.0.2)
And on elasticsearch no filebeat-%{+yyyy.MM.dd.HH} index is created

Looking at filebeat.log, I see now these errors :
2016-12-07T17:38:04+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:24+01:00 ERR Failed to publish events caused by: EOF
2016-12-07T17:38:54+01:00 ERR Failed to publish events caused by: EOF

Here is an extract of my filebeat.yml :

  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'
    ssl:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-beats.crt"]

Laurent

warkolm · December 7, 2016, 8:36pm

Unless you have very large amounts of data, hourly indices don't make much sense and you end up wasting resources.

lolo67 · December 8, 2016, 6:42am

Indeed it is "large amounts of data"

warkolm · December 8, 2016, 6:44am

How large?

lolo67 · December 8, 2016, 8:22am

Hello Mark,

Almost 40 GB/day, but it can be much more if many error logs occur.
Logs are not kept on a physical server; the ELK Docker container runs on a VMware VM.
And I would like to drop indices at an hourly granularity if there are too many of them.

By the way, can you help me with "Failed to publish events caused by: EOF" error in filebeat.log ?
logstash-plain.log says :

[2016-12-08T08:14:26,598][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08", :_type=>"log", :_routing=>nil}, 2016-12-08T08:14:23.407Z lmtesm1d.dmz.e-i.net 10.46.41.222 - - [08/Dec/2016:09:14:05 +0100] "GET /resource/1.0.0/artifact/_system/governance/trunk/restservices/3.0.1 HTTP/1.1" 200 90 "-" "Mozilla/4.0 [en] (WinNT; I)"], :response=>{"index"=>{"_index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08", "_type"=>"log", "_id"=>nil, "status"=>400, "error"=>{"type"=>"invalid_index_name_exception", "reason"=>"Invalid index name [filebeat-%{+yyyy.MM.dd.HH}-2016.12.08], must be lowercase", "index_uuid"=>"_na_", "index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08"}}}}

It shouldn't be "_index"=>"filebeat-%{+yyyy.MM.dd.HH}-2016.12.08"
but "_index"=>"filebeat-2016.12.08.09"

What is wrong with filebeat.yml ?

  logstash:
    hosts: ["elk-docker:5044"]
    index: 'filebeat-%{+yyyy.MM.dd.HH}'

Regards
Laurent

warkolm · December 8, 2016, 8:24am

That's not large enough to do hourly indices.

Our recommendation is to keep shards under 50GB, so in reality you aren't even doing that in a day. You're definitely going to be wasting resources by moving to hourly.
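
As a rough back-of-the-envelope sketch of that point: assuming the log volume translates one-to-one into index size and each index has 5 primary shards (the `pri` value shown in the `_cat/indices` output later in this thread), the average primary shard size can be estimated as below. The helper name `avg_primary_shard_gb` is purely illustrative, and real index size will differ from raw log size.

```python
def avg_primary_shard_gb(gb_per_day, indices_per_day, primary_shards=5):
    """Average size of one primary shard in GB (replicas ignored).

    Assumption: index size equals ingested log volume, which is only
    an approximation.
    """
    return gb_per_day / indices_per_day / primary_shards


for label, gb_per_day in (("average day", 40), ("burst day", 400)):
    daily = avg_primary_shard_gb(gb_per_day, indices_per_day=1)
    hourly = avg_primary_shard_gb(gb_per_day, indices_per_day=24)
    print(f"{label}: daily index {daily:.2f} GB/shard, "
          f"hourly index {hourly:.2f} GB/shard")
```

Under these assumptions, hourly indices would create 24 x 5 = 120 primary shards per day instead of 5, each holding well under 1 GB on an average day.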

lolo67 · December 8, 2016, 8:42am

Yes, I understand that 40 GB is under 50 GB.
But it's an average, and if I get 400 GB in a single day,
then I want to drop hourly indices rather than daily ones.
Steffen Siering told me that hourly indices are available in Filebeat 5.0+.
So what is the correct syntax to make it work?

geekpete · December 8, 2016, 9:16am

You might find some benefit using the Rollover Index API:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/indices-rollover-index.html

This will allow you to do weekly indices but if there's an unexpected burst of events, the indices will automatically roll over after hitting a specific max size.

You really want to avoid having an unnecessary number of shards in your cluster, as this will cause overhead and impact your cluster's health and performance at some point. Two causes of this are daily indices for clusters that would be better suited to a weekly or other rollover frequency, and configuring indices to use far too many shards by default; often it is a combination of these two factors.

lolo67 · December 8, 2016, 1:18pm

Thanks Peter.
I followed your advice and used the rollover api :
Today the Filebeat index is named filebeat-2016.12.08.
So I want it to rollover each hour or for testing let's say every 1000 docs :

curl -XPOST 'localhost:9200/filebeat/_rollover/filebeat-2016.12.08-1?pretty' -d'
 {
   "conditions": {
     "max_age":   "1h",
     "max_docs":  1000
   }
 }'

{
  "old_index" : "filebeat-2016.12.08",
  "new_index" : "filebeat-2016.12.08-1",
  "rolled_over" : true,
  "dry_run" : false,
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "conditions" : {
    "[max_age: 1h]" : true,
    "[max_docs: 1000]" : true
  }
}

OK now filebeat-2016.12.08-1 is created because of the rollover :

curl -XGET 'http://localhost:9200/_cat/indices?v'
health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   filebeat-2016.12.08   qUduxdndTcWwYqBcdDH_lA   5   1      46053            0     13.8mb         13.8mb
yellow open   filebeat-2016.12.08-1 yYSVbfufQROtXjL0aW9rYw   5   1          0            0       650b           650b

But, first problem: Logstash is still sending logs to filebeat-2016.12.08 and not to filebeat-2016.12.08-1.
And second, one hour or 1000 docs later, no filebeat-2016.12.08-2 is created.

As I understand, rollover is for an index alias and not for an index.
How can I tell Filebeat to put logs into the alias index and not the index ?
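
A minimal sketch of the alias-based pattern the Rollover API expects, under stated assumptions: the index name `filebeat-000001` and the alias `filebeat_write` are hypothetical, and the host is the `localhost:9200` used in the curl calls above. Writers would have to target the alias rather than a dated index name. The rollover conditions are only evaluated when the request is issued, so something (a cron job, for instance) has to call it periodically. The code only builds the requests; sending them is left commented out.

```python
import json
from urllib.request import Request

ES = "http://localhost:9200"  # host taken from the curl calls in this thread


def _json_request(method, path, body):
    """Build (but do not send) a JSON request against Elasticsearch."""
    return Request(
        f"{ES}{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_first_index(index="filebeat-000001", alias="filebeat_write"):
    """Create the first index with a write alias pointing at it.

    Both names are hypothetical. A name ending in '-' plus a number lets
    rollover derive the next index name by incrementing the number.
    """
    return _json_request("PUT", f"/{index}", {"aliases": {alias: {}}})


def rollover(alias="filebeat_write", max_age="1h", max_docs=1000):
    """Ask Elasticsearch to roll the alias over if a condition is met."""
    conditions = {"max_age": max_age, "max_docs": max_docs}
    return _json_request("POST", f"/{alias}/_rollover", {"conditions": conditions})


req = rollover()
print(req.get_method(), req.full_url, req.data.decode())
# To actually send it: urllib.request.urlopen(req)
```

With this layout, each successful rollover moves the alias to the newly created index, so anything writing to the alias follows along automatically.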

steffens · December 8, 2016, 1:47pm

Oooohhh... I see you are trying to set the ES index on the Logstash output? This doesn't work, as the index option in Filebeat outputs is not supported by the Logstash output. It's in Logstash that you have to set the index name.
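
To illustrate where the odd index name in the error log comes from, here is a small simulation, not Logstash's actual code. It assumes that the `index` value from Filebeat's Logstash output arrives as `[@metadata][beat]`, that Logstash expands its own `index` pattern in a single pass, and that dates are formatted in UTC. The function `expand` and the token table are hypothetical and cover only the date tokens used in this thread.

```python
import re
from datetime import datetime, timezone

# Joda-style date tokens seen in this thread, mapped to strftime codes.
JODA_TO_STRFTIME = {"YYYY": "%Y", "yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}


def expand(template, metadata, timestamp):
    """Single-pass expansion of %{[@metadata][key]} and %{+DATEFORMAT}."""
    def replace(match):
        token = match.group(1)
        if token.startswith("+"):
            fmt = re.sub(r"YYYY|yyyy|MM|dd|HH",
                         lambda m: JODA_TO_STRFTIME[m.group(0)], token[1:])
            return timestamp.astimezone(timezone.utc).strftime(fmt)
        key = re.fullmatch(r"\[@metadata\]\[(\w+)\]", token)
        # Substituted values are inserted as-is and never expanded again.
        return metadata[key.group(1)] if key else match.group(0)
    return re.sub(r"%\{([^}]+)\}", replace, template)


ts = datetime(2016, 12, 8, 8, 14, 23, tzinfo=timezone.utc)  # from the Logstash log

# Pattern set in filebeat.yml, default pattern in Logstash:
bad = expand("%{[@metadata][beat]}-%{+YYYY.MM.dd}",
             {"beat": "filebeat-%{+yyyy.MM.dd.HH}"}, ts)
print(bad, "-> has uppercase:", bad != bad.lower())

# Hourly pattern set directly in the Logstash elasticsearch output:
print(expand("filebeat-%{+YYYY.MM.dd.HH}", {}, ts))
```

In this simulation the first call reproduces the rejected name `filebeat-%{+yyyy.MM.dd.HH}-2016.12.08`, whose unexpanded `MM` and `HH` are uppercase, while the second yields `filebeat-2016.12.08.08`, with the hour in UTC rather than local time.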

lolo67 · December 8, 2016, 1:59pm

In /etc/logstash/conf.d/30-output.conf there is :

output {
  elasticsearch {
    hosts => ["localhost"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

I replaced index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
with index => "filebeat-%{+YYYY.MM.dd.HH}"

After restarting Logstash and deleting the existing indices, it now works.

Thank you Steffen.

Best regards
Laurent

system · January 5, 2017, 1:59pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
