I've set up a new Beats -> Logstash -> Elasticsearch system and have started collecting data from a number of sources (files, metrics, etc.). Looking through the indices available via Kibana, I'm seeing nearly a hundred entries marked 'index-' (e.g. "filebeat-2018.08.10"), and the same for "metricbeat".
Is this by design? Do I want / need a new index every day?
Looking through 'filebeat.reference.yml' it appears that these are generated by elasticsearch. I don't send data directly there but rather through logstash... Or do I have this configured incorrectly? I have the elasticsearch entries in 'filebeat.yml' commented out. There's certainly data being sent in so I assume the 'logstash' entries are at least accurate.
Lots of spurious entries are also appearing in the indices (e.g. "#033.31;1mbuilds#033.0;m.keyword").
Where are these coming from, and how do I eliminate them?
By default, the Logstash elasticsearch output plugin writes its documents to an index named logstash-%{+YYYY.MM.dd} (see the plugin documentation).
This is done on purpose, to make it easier to clean up old data. When you create an index pattern in Kibana, you can use the pattern logstash-* so that your dashboards pick up all of the daily indices.
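To make the naming scheme concrete, here is a minimal Python sketch of how a daily date suffix and a wildcard pattern fit together. It only imitates the behaviour for illustration: the helper name `daily_index_name` is hypothetical, and neither Logstash nor Kibana exposes this function.

```python
from datetime import date
from fnmatch import fnmatch


def daily_index_name(prefix: str, day: date) -> str:
    """Build a name in the style of '<prefix>-%{+YYYY.MM.dd}' (hypothetical helper)."""
    return f"{prefix}-{day:%Y.%m.%d}"


# One index per day, as produced by a daily date pattern.
names = [daily_index_name("logstash", date(2018, 8, d)) for d in (9, 10, 11)]

# A wildcard such as 'logstash-*' selects every daily index with that prefix
# and skips indices that belong to another prefix.
candidates = names + ["metricbeat-2018.08.10"]
matching = [n for n in candidates if fnmatch(n, "logstash-*")]
```

The same idea applies to patterns such as filebeat-* and metricbeat-* for the indices mentioned in the question.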
That sounds very much like Beats is shipping directly to Elasticsearch. What does your Filebeat output config look like, for example?
Looking through 'filebeat.reference.yml' it appears that these are generated by elasticsearch.
Elasticsearch indices are created by Beats or Logstash, depending on which one writes to Elasticsearch.
Do I want / need a new index every day?
Having daily indices, at least for logs, helps with housekeeping. It is much easier to delete whole indices based on time than it is to delete documents from within indices based on time. Usually for logs you set some sort of retention period, unless you have unlimited storage capacity.
You could go less granular and have weekly indices as well, especially if your log volume is low.
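As a rough illustration of why time-based index names make housekeeping easy, the following Python sketch picks out the daily indices that fall outside a retention period purely from their names. The function `expired_indices` and the 7-day retention are assumptions made for this example. It does not talk to Elasticsearch; it only decides which names would be candidates for deletion.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, retention_days):
    """Return names whose 'YYYY.MM.dd' suffix is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        # Split on the last '-' so prefixes containing dashes still work.
        _prefix, _sep, suffix = name.rpartition("-")
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index (e.g. '.kibana'), leave it alone
        if day < cutoff:
            expired.append(name)
    return expired


indices = [
    "filebeat-2018.08.01",
    "filebeat-2018.08.10",
    "metricbeat-2018.07.30",
    ".kibana",
]
to_delete = expired_indices(indices, today=date(2018, 8, 11), retention_days=7)
```

Each selected name can then be removed as a whole index, which is far cheaper than deleting individual documents by timestamp.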
filebeat.inputs:
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  exclude_files: ['.gz$']
#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3
#================================ Outputs =====================================
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["logstash.otech:5044"]
I use Puppet to push this same config out to all hosts that send logs.