Noob index management - naming / cleanup


(ethr bunny) #1

I've set up a new beat->logstash->elastic system and have started collecting data from a number of sources (files, metrics, etc.). Looking through the indexes available via Kibana I'm seeing nearly a hundred entries of the form 'index-date' (e.g. "filebeat-2018.08.10"; same for "metricbeat").

  • Is this by design? Do I want / need a new index every day?

Looking through 'filebeat.reference.yml' it appears that these are generated by elasticsearch. I don't send data directly there but rather through logstash... or do I have this configured incorrectly? I have the elasticsearch entries in 'filebeat.yml' commented out. There's certainly data being sent in, so I assume the 'logstash' entries are at least accurate.

  • Lots of spurious entries are appearing in the indexes (e.g. "#033[31;1mbuilds#033[0m.keyword")
    --> where are these coming from and how do I eliminate them?

(Andreas H) #2

By default the elasticsearch output plugin will put its documents in a daily index named logstash-%{+YYYY.MM.dd} (see the docs).
This is done on purpose to help you clean up old data. When you create an index pattern inside Kibana you can define it as logstash-* so that your dashboards pick up all of those daily indexes.


(ethr bunny) #3

I'm confused, as I'm sending all my data via logstash. Or are you referring to the logstash->elastic output?


(Andreas H) #4

Yes, sorry. I meant the Logstash output:

output {
  elasticsearch {}
}

This will automatically generate the index name logstash-%{+YYYY.MM.dd} unless you define another index name.


#5

That sounds very much like beats is shipping directly to Elasticsearch. What does e.g. your filebeat output config look like?

Looking through 'filebeat.reference.yml' it appears that these are generated by elasticsearch.

Elasticsearch indices are created by whichever client sends the data, i.e. Beats or Logstash; Elasticsearch does not create them on its own.

Do I want / need a new index every day?

Having daily indices, at least for logs, helps with housekeeping. It is much easier to delete whole indices based on time than it is to delete documents from indices based on time. For logs you usually set some sort of retention period, unless you have unlimited storage capacity.

You could go less granular and have weekly indices as well, especially if your log volume is low.
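To make the housekeeping point concrete, here is a minimal sketch (not from the thread) of naming the daily index that falls outside a retention window so it can be deleted wholesale; the index prefix, retention period, and Elasticsearch host are illustrative assumptions.

```shell
#!/bin/sh
# Hedged sketch: compute the name of the daily index that is N days old.
# Prefix and host below are assumptions; adjust to your setup.
old_index() {
  prefix=$1
  days=$2
  # GNU date assumed for the -d "-N days" offset syntax
  printf '%s-%s\n' "$prefix" "$(date -u -d "-${days} days" +%Y.%m.%d)"
}

# e.g. drop the filebeat index from 30 days ago:
# curl -XDELETE "http://localhost:9200/$(old_index filebeat 30)"
old_index filebeat 30
```

Deleting a whole index like this is one cheap operation, whereas a delete-by-query over the same time range has to touch every matching document. In practice a tool such as Elasticsearch Curator can run this kind of cleanup on a schedule.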


(ethr bunny) #6

My config:

filebeat.inputs:

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  exclude_files: ['.gz$']

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 3

#================================ Outputs =====================================

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["logstash.otech:5044"]

I use puppet to push out this same config to all hosts sending logs.


#7

Have you restarted the filebeat service? With reload.enabled: false you will have to do that manually.


(ethr bunny) #8

Yes. I believe the indexes are being generated by logstash though:
(from /etc/logstash/conf.d/beats.conf)('input' and 'filter' sections removed)

output {
  elasticsearch {
    hosts => "10.xx.xx.xx:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
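That index setting is exactly what produces the names in the first post: the sprintf reference %{[@metadata][beat]} expands to the name of the beat that shipped the event, and %{+YYYY.MM.dd} to the event's date. A tiny sketch of the expansion (the values are illustrative):

```shell
# Hedged sketch: how index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}" expands
# for an event shipped by filebeat with an @timestamp of 2018-08-10.
beat="filebeat"      # value of [@metadata][beat]
day="2018.08.10"     # event date in the +YYYY.MM.dd format
printf '%s-%s\n' "$beat" "$day"   # → filebeat-2018.08.10
```

So one index per beat per day is expected with this config, which is exactly the daily granularity that #5 recommends for retention.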

(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.