I was wondering what the best way to store log data is: a single index (perhaps with ILM to manage it) or many small indices (for example logs-2021.07.01, logs-2021.07.02, etc.)?
Which approach is recommended?
Is there any performance impact if I choose the single large index?
I ask because Elasticsearch has a limited number of shards (the limit depends on the number of nodes, right?).
Thanks
If you have log data, you generally want to use time-based indices, as this makes managing retention by deleting whole indices very efficient. Deleting documents from a single large index is quite costly. That said, you want to make sure that your shards are not too small (aim for a few tens of GB in size as a guideline), so you may want to use rollover, or weekly or monthly indices, depending on how much data you ingest per day.
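As a rough illustration of that sizing guideline, here is a minimal Python sketch that picks the shortest index period whose shards would reach a target size. The function name, the 30 GB default target, and the 30-day month are assumptions made for the example, not Elasticsearch settings.

```python
def suggest_index_period(daily_gb, primary_shards=1, target_shard_gb=30.0):
    """Return the shortest period whose shards reach the target size.

    daily_gb:        average primary data ingested per day, in GB
    primary_shards:  number of primary shards per index
    target_shard_gb: desired shard size (assumed 30 GB, "a few tens of GB")
    """
    per_shard_daily_gb = daily_gb / primary_shards
    for name, days in (("daily", 1), ("weekly", 7), ("monthly", 30)):
        if per_shard_daily_gb * days >= target_shard_gb:
            return name
    # Even a month of data stays below the target: use the longest
    # period considered here (or size-based rollover instead).
    return "monthly"


if __name__ == "__main__":
    for gb in (40, 5, 2):
        print(f"{gb} GB/day -> {suggest_index_period(gb)} indices")
```

With one primary shard, 40 GB per day suggests daily indices, 5 GB per day suggests weekly, and 2 GB per day suggests monthly.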
I have a large number of documents per index (about 6kk, i.e. 6 million, per index).
Just for context:
Today I'm saving my logs into small indices, split by day, but this is using up my shard limit.
Can I save these logs into weekly indices instead of daily indices, to stay under the shard limit?
And maybe use ILM to handle retention of old indices?
What do you recommend ?
Use time-based indices, but do not use daily indices if they end up being small. Instead, use ILM with rollover to reach a good target size and have each index cover a longer time period.
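To make the ILM-with-rollover suggestion concrete, here is a hedged Python sketch that builds a policy body of the kind you would send with PUT _ilm/policy/<policy-name>. The helper name and the 30gb / 30d / 90d values are assumptions chosen for illustration; tune them to your own ingest volume and retention needs.

```python
import json


def build_ilm_policy(max_size="30gb", max_age="30d", delete_after="90d"):
    """Build an ILM policy body: roll over in the hot phase, delete later.

    All default values are illustrative assumptions, not recommendations
    from the thread beyond "a few tens of GB" per shard.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when the index reaches max_size or
                        # max_age, whichever condition is met first.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "delete": {
                    # With rollover in use, min_age is counted from the
                    # time the index rolled over.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body so it can be pasted into a request.
    print(json.dumps(build_ilm_policy(), indent=2))
```

Writing then goes through a rollover alias or data stream that uses this policy, so each backing index covers as long a period as it takes to reach the size or age limit.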