We are using Elasticsearch mainly as a cache, for getting fast results. Currently, we have all the default settings, which I think means 1 replica per shard.
Should I change this setting to 0? Are replicas just for data recovery, or do they also affect performance/functionality?
We don't care much about data loss since we often use the curl -X DELETE http://elasticsearch:9200/_all command.
Hi @mahalabobis
Replicas can also provide read parallelization, which can positively affect read performance (not always, though). You may or may not need that, depending on your requirements.
Here are some strategies for search performance. You may not need any of these but I thought I would pass them on.
My main concern is to reduce the number of shards. We have more than 600 shards per node. More often than not, the cluster goes yellow with many unassigned shards. So I thought maybe removing replicas is a potential solution.
Yes, you can remove the replicas, as long as you are OK without HA, as you stated. You may also consider reducing the number of primary shards, which may help performance too. You can set the replicas to 0 in the index settings for all your indices; the cluster should then go green, since a yellow status means some replica shards are unassigned.
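As a rough illustration of that last step, here is a minimal Python sketch that builds a request for the update index settings API to set `number_of_replicas` to 0 on every index. The host name is copied from the curl command earlier in the thread; the function names `build_replica_request` and `apply_replicas` are made up for this example, and the sketch assumes a cluster without authentication.

```python
import json
import urllib.request

# Assumption: same host and port as the curl command quoted in the thread.
ES_URL = "http://elasticsearch:9200"


def build_replica_request(replicas=0, index="_all", base_url=ES_URL):
    """Build (but do not send) a PUT request that sets the replica count.

    Hypothetical helper for illustration; it targets the
    <index>/_settings endpoint with the index.number_of_replicas setting.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def apply_replicas(replicas=0, index="_all"):
    """Send the request to the cluster and return the decoded JSON response."""
    with urllib.request.urlopen(build_replica_request(replicas, index)) as resp:
        return json.load(resp)
```

Calling `apply_replicas(0)` against a reachable cluster would send the request. The replica count is a dynamic setting, so it can be changed on existing indices without reindexing. Indices created later will still get the default replica count unless an index template sets it, so you may want to cover that case as well.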