Basic understanding regarding index creation

Hi Team,

I am facing an issue where an index was not created after I made some changes to the Logstash pipeline file.

Could you please help me with the questions below? I am having difficulty understanding this.

I have a simple architecture: Filebeat watches the log files and sends them to Logstash, and Logstash sends them to Elasticsearch (ES).

  1. If log files are created daily (e.g. log1, log2, etc.), will new indices also be created daily (e.g. index1, index2, etc.), or will there be one index that grows day by day? What makes indices get created daily, the same way the log files are?

  2. Is there a particular time of day when ES creates the new daily index?

  3. If no log file is created one day (due to some issue), will no index be created for that day either?

And if a log file is created the next day, will the index be created automatically, or do we have to do something to get it created?

  4. Say I have 30 files indexed into ES. If my Elasticsearch data is deleted or corrupted (due to some issue) and I deploy ES again, it will index those files again, so I am trying to understand what I actually lose. As long as I have the source log files, I can get them indexed by deploying ES again and again (leaving aside saved dashboards etc., which will be lost anyway), so I can safely delete ES. Is there anything I am missing, or any disadvantage to doing this that I have not understood?

In other words, even if something happens to ES, as long as I have the source log files I can get them indexed and query them by deploying ES again. So I could delete Elasticsearch, on purpose or by accident, and not need to worry, because the log files are intact.

I am just trying to understand how risky it is if ES gets deleted, given that I still have the log files and my only purpose is to view those logs in Kibana and run queries on them.

Thanks,

You need to review the documentation. In older versions, the default was to create a new index each day. In current versions that support rollover, the default is to create a single index that is rolled over when it reaches either 50 gigabytes in size, or is 30 days old, whichever happens first.

If you are using daily indexes they roll to a new index at midnight UTC.

If there is no log for a day, so no events for that day, then no daily index will be created.
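
To make the last two points concrete, here is a rough Python sketch (not Logstash code) of how a date-based index name follows the UTC date of each event's timestamp. The `daily_index` helper, the `access_server` prefix, and the sample events with a +05:30 offset are made up for illustration; the only behavior assumed is that the date in the index name comes from the event timestamp expressed in UTC, and that an index only appears once an event for that day is written.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix: str, event_time: datetime) -> str:
    """Build a daily index name from the UTC date of an event timestamp."""
    utc_time = event_time.astimezone(timezone.utc)
    return f"{prefix}-{utc_time:%Y.%m.%d}"


# Hypothetical events logged in a +05:30 local time zone.
local_tz = timezone(timedelta(hours=5, minutes=30))
events = [
    datetime(2021, 7, 8, 23, 0, tzinfo=local_tz),   # 17:30 UTC on 8 July
    datetime(2021, 7, 9, 4, 0, tzinfo=local_tz),    # 22:30 UTC on 8 July
    datetime(2021, 7, 11, 12, 0, tzinfo=local_tz),  # 06:30 UTC on 11 July
]

# Only days that actually received events end up with an index.
indices = sorted({daily_index("access_server", e) for e in events})
print(indices)
```

In this sketch the event logged at 04:00 local time on 9 July still lands in the 2021.07.08 index because it is before midnight UTC, and no index name is produced for 9 or 10 July because no event falls on those UTC days.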

Hi @badger,

Thank you for your reply.

I am going through the link you provided.

When I check the ILM status, I can see operation_mode is Running.

I think ILM is not enabled for Filebeat, but I can still see daily filebeat indices being created.

"filebeat-7.4.0-2021.07.01" : {
  "index" : "filebeat-7.4.0-2021.07.01",
  "managed" : false
},
"filebeat-7.4.0-2021.07.08" : {
  "index" : "filebeat-7.4.0-2021.07.08",
  "managed" : false

Can you tell me what triggers the creation of the daily filebeat indices, and where I can check that setting?

Also, could you please reply to the second part of question 3 and to question 4 above? I have modified them a bit.

Sorry for the trouble.

If filebeat is writing directly to elasticsearch the default value for index is "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}", so you will get daily indexes.

In my case Filebeat is sending to Logstash:

output.logstash:
  hosts: ['11.1.1.50:5044']

In the Logstash pipeline config file, as you know, the index name is built from the type, which in this case is access_server. That is why we can see an access_server index being created daily.

if [log_type] == "access_server" {
  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    index => "%{type}-%{+YYYY.MM.dd}"
    user => elastic
    password => x
  }
}

At the end of the pipeline config file I have the block below. It looks like ES is creating the filebeat index daily because of this config.

  elasticsearch {
    hosts => ['http://11.1.1.50:9200']
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    user => elastic
    password => x
  }
}

I can see filebeat, access_server and .monitoring-logstash indices getting created daily.

filebeat-7.4.0-2021.06.30
filebeat-7.4.0-2021.07.04
filebeat-7.4.0-2021.07.03
filebeat-7.4.0-2021.07.02
filebeat-7.4.0-2021.07.01
filebeat-7.4.0-2021.07.08
access_server-2021.07.09          
access_server-2021.07.06          
access_server-2021.07.07          
access_server-2021.07.04          
access_server-2021.07.05          
access_server-2021.07.02  
.monitoring-logstash-7-2021.07.07 
.monitoring-logstash-7-2021.07.08
.monitoring-logstash-7-2021.07.09 
.monitoring-logstash-7-2021.07.03
.monitoring-logstash-7-2021.07.04

It is still not clear to me which config causes them to be created daily. Is it the %{+YYYY.MM.dd} part that causes the indices to be created daily?

Yes, it is the date formatted (%{+YYYY.MM.dd}) portion of the index name that is making it a daily file since those values change every day at midnight server time.
If you left the 'dd' part out (%{+YYYY.MM}) then the indexes would be monthly instead of daily.

They change at midnight UTC.
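
As a rough illustration of the daily versus monthly point, the Python sketch below mimics how the date pattern decides the index name. It is not how Logstash is implemented: `format_index` and the small `JODA_TO_STRFTIME` table are hypothetical helpers that cover only the tokens used in this thread, and the timestamps are made-up examples in UTC.

```python
from datetime import datetime, timezone

# Hypothetical mapping for just the date tokens that appear in this thread.
JODA_TO_STRFTIME = {"YYYY": "%Y", "yyyy": "%Y", "MM": "%m", "dd": "%d"}


def format_index(prefix: str, joda_pattern: str, ts: datetime) -> str:
    """Render prefix plus the UTC date of ts formatted with a Joda-style pattern."""
    fmt = joda_pattern
    for token, code in JODA_TO_STRFTIME.items():
        fmt = fmt.replace(token, code)
    return f"{prefix}-{ts.astimezone(timezone.utc).strftime(fmt)}"


day_one = datetime(2021, 7, 8, 12, 0, tzinfo=timezone.utc)
day_two = datetime(2021, 7, 9, 12, 0, tzinfo=timezone.utc)

# With the day in the pattern, each UTC day gets its own name.
daily = {format_index("access_server", "YYYY.MM.dd", t) for t in (day_one, day_two)}
# Without the day, both events map to the same monthly name.
monthly = {format_index("access_server", "YYYY.MM", t) for t in (day_one, day_two)}

print(sorted(daily))
print(sorted(monthly))
```

With the `dd` token present the two sample events resolve to two different names, and without it they resolve to one, which is why dropping `dd` from the `index` setting gives monthly indices.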

Thanks @bshaw

:+1:

Any reply to question 4 would be appreciated. Of course I will not treat a reply as official, and no one will be held accountable for it; it is just for my own knowledge.
