Logstash pipeline index question

I think you should change the way you think about this. Daily indexes were useful for a retention policy like "I want 90 days of web server logs". The indexes would rollover daily and you could use a program like curator (or nowadays ILM) to delete any older than 90 days. In kibana you would use an index pattern to merge all 90 logs into a single searchable index.

Using daily indexes across a large number of index patterns will run you out of shards very quickly. If you have three versions of four different beats you will be using more than 10 indexes a day. If for some reason you have multiple shards per index (not the default since 7.x) then you could be using 25 indexes a day, meaning you run out after little more than a month.

Using the cat shards API should help you figure out what is happening. I understand you are in an environment where you cannot copy the output to show us.

With ILM you can implement a retention policy like "I want to keep the last 50 GB of syslog data". If you get 100 GB of syslog per day that will be 12 hours, if you get 10 MB per day that will be more than 10 years worth. You would configure logstash to write all the syslog data to an index called "syslog", then configure ILM to roll that over every 20 GB, and keep the last two rolled over indexes.

Similarly for the beats data. Instead of daily indexes for each beat you could try a single index called "beats". If that results in too many fields then you could use "%{[@metadata][beat]}". If that results in mapping conflicts because different versions of a beat use different field structures then try "beats-%{[metadata][version]}". You might even need "%{[@metadata][beat]}-%{[metadata][version]}", but take the date out of the index name and use ILM to determine the amount of disk space each index can use.