I am facing an issue where an index has not been created after some changes to my Logstash pipeline file.
Can you please help me with the questions below? I am finding it difficult to understand.
I have a simple architecture: Filebeat looks for log files and sends them to Logstash, and Logstash sends them to Elasticsearch.
1. If daily log files are created (e.g. log1, log2, etc.), will new indices also get created daily (e.g. index1, index2, etc.), or will there be one index getting bigger in size day by day? What makes indices get created daily, in the same way that the log files are created daily?
2. Is there a particular time of day at which Elasticsearch creates the new daily index?
3. What if, one day, the log file is not created (due to some issue)? Will the new index also not get created for that day?
And if the log file is created again the next day, will the index be created automatically, or do we have to do something to get it created?
4. Say I have 30 files which are indexed into Elasticsearch. If my Elasticsearch data gets deleted or corrupted (due to some issue) and I then deploy Elasticsearch again, it will index those files again, so I am trying to understand where my loss is here. As long as I have the source log files, I can get them indexed by deploying Elasticsearch again and again (leaving aside saved dashboards and the like, which would be lost anyway), so I can safely delete Elasticsearch. Is there anything I am missing, or any disadvantage of doing this that I am not understanding?
In other words, even if something happens to Elasticsearch, as long as I have the source log files I can get them indexed and run queries on them by redeploying Elasticsearch, so I could purposely or accidentally delete Elasticsearch without worrying, since I still have the log files intact.
I am just trying to understand how much risk there is if Elasticsearch gets deleted but I have the log files, given that my purpose is to view those logs in Kibana and run queries on them.
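To make the premise of question 4 concrete: one way to re-index saved log files into a freshly deployed Elasticsearch is to point a Logstash file input directly at them. This is only a minimal sketch with a hypothetical path, not the pipeline from this thread:

input {
  file {
    # hypothetical location of the saved source log files
    path => "/var/log/app/log*.log"
    # start from the top of each file and keep no sincedb state,
    # so a fresh deployment re-reads and re-indexes everything
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}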
You need to review the documentation. In older versions, the default was to create a new index each day. In current versions that support rollover, the default is to create a single index that is rolled over when it reaches either 50 gigabytes in size, or is 30 days old, whichever happens first.
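Not from this thread, but for reference, that rollover behaviour can also be driven from the Logstash side with the ILM options of the elasticsearch output. A minimal sketch, where the alias name is only an example (the plugin's own defaults are "logstash" for the alias and "logstash-policy" for the policy):

output {
  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    # write to a rollover alias instead of a dated index name;
    # Elasticsearch rolls to a new backing index when the ILM
    # policy's size or age threshold is reached
    ilm_enabled => true
    ilm_rollover_alias => "access_server"
    ilm_pattern => "{now/d}-000001"
    ilm_policy => "logstash-policy"
  }
}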
If you are using daily indexes they roll to a new index at midnight UTC.
If there is no log for a day, and therefore no events for that day, then no daily index will be created.
If filebeat is writing directly to elasticsearch the default value for index is "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}", so you will get daily indexes.
And in the Logstash pipeline config file, as you know, it creates the index name based on the type, which in this case is access_server. That is why we can see the access_server index getting created daily:
if [log_type] == "access_server" {
  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    index => "%{type}-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "x"
  }
}
At the end of the pipeline config file I have the below. It looks like Elasticsearch is creating the filebeat index daily due to this config:
  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "x"
  }
}
I can see filebeat, access_server and .monitoring-logstash indices getting created daily.
Yes, it is the date-formatted portion (%{+YYYY.MM.dd}) of the index name that makes it a daily index, since that value changes every day at midnight UTC.
If you left the 'dd' part out (%{+YYYY.MM}) then the indexes would be monthly instead of daily.
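For example, a monthly variant of the access_server output above would differ only in the date pattern (a sketch reusing the same example values):

  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    # no 'dd' component, so the formatted value only changes once a
    # month and a whole month of events lands in a single index
    index => "%{type}-%{+YYYY.MM}"
    user => "elastic"
    password => "x"
  }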
Any reply to the 4th question is appreciated. Of course, I am not going to take it as official, and no one will be held accountable for it; it is just for knowledge.