I'm running a single-node Elasticsearch test environment and generating indices with the pattern prefix-[type]-yyyy.mm.dd.
Information is generated in log files, which are sent with Filebeat to Logstash for processing.
There are different values for [type], so each day I get several indices.
Today I realized that I had not had any new information in Elasticsearch since yesterday. Looking in the Logstash log files, I saw that when it sent the information to Elasticsearch, the latter had reached the maximum number of shards and was not able to create a new shard.
I then increased the value of cluster.max_shards_per_node and the new information coming from Filebeat got indexed; however, the logs that had failed to be indexed during the night were not indexed.
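For anyone hitting the same limit, the setting can be changed at runtime through the cluster settings API, along these lines (a sketch; the value 2000 and the localhost address are just examples, and you can use "persistent" instead of "transient" if you want the change to survive a full cluster restart):

```shell
# Raise the shard limit at runtime. "transient" means the change is
# lost on a full cluster restart; the default limit is 1000 shards
# per data node. 2000 below is only an example value.
curl -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{ "transient": { "cluster.max_shards_per_node": 2000 } }'
```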
Isn't Logstash supposed to retry if it cannot reach the output? Or, since in this case the output was reachable but Elasticsearch returned an error, does Logstash not try to index the information again?
If Elasticsearch was non-responsive I would expect Logstash to queue. If Elasticsearch returned an error then you would lose events (unless you have a dead letter queue (DLQ) configured and it is a retryable error).
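Enabling the DLQ is done in logstash.yml; a minimal sketch (the path below is an example):

```yaml
# logstash.yml -- enable the dead letter queue so events that the
# elasticsearch output cannot deliver are written to disk instead of
# being dropped. The path is an example; it must be writable by Logstash.
dead_letter_queue.enable: true
path.dead_letter_queue: /var/lib/logstash/dlq
```

Events captured there can later be replayed with the dead_letter_queue input plugin.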
Having a large number of small shards is inefficient and can cause performance and stability problems. I would recommend reading this blog post and looking to reduce the number of shards in the cluster, e.g. by switching from daily to weekly or monthly indices. The limit is there for a reason and is set quite high, so just increasing it is not a great solution.
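Switching to monthly indices is typically just a change to the index option of the elasticsearch output in the Logstash pipeline; a sketch (the prefix, host, and field name are examples, not your actual config):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Monthly instead of daily: +yyyy.MM rather than +yyyy.MM.dd,
    # so one index per [type] per month instead of per day.
    index => "prefix-%{[type]}-%{+yyyy.MM}"
  }
}
```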
Thanks Christian. I had read that document while researching the original problem, and I'm changing the way indices are generated so that they are monthly. After I removed some indices, I set the parameter back to 1000.