We have been running under constant memory pressure on our ES nodes. On closer analysis, it appears that segment data is consuming more than 50% of the available heap. Is there a way to configure the amount of memory that can be used for caching segment data?
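For context, this is roughly how we are measuring it: a minimal sketch that compares per-node segment memory against the JVM heap via the node stats API. It assumes the cluster is reachable at http://localhost:9200 without authentication, and a version that still reports `indices.segments.memory_in_bytes` (much of this moved off heap in later releases).

```python
# Sketch: compare per-node segment memory to heap usage via node stats.
# Assumes http://localhost:9200, no auth, and a version that still reports
# indices.segments.memory_in_bytes.
import requests

stats = requests.get("http://localhost:9200/_nodes/stats/indices,jvm").json()

for node_id, node in stats["nodes"].items():
    seg_mem = node["indices"]["segments"]["memory_in_bytes"]
    heap_used = node["jvm"]["mem"]["heap_used_in_bytes"]
    heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
    print(f"{node['name']}: segments {seg_mem / 2**30:.1f} GiB, "
          f"heap {heap_used / 2**30:.1f} / {heap_max / 2**30:.1f} GiB")
```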
We thought about reducing the number of shards, but that would lead to very large shards, since our index is about 1 TB. Is there a better practice / strategy to use in such cases?
The data is quite heterogeneous, but it is largely e-commerce / retail orders, including customer requests, with fields such as an array of comments, phone number, email ID, dates, etc. We also have international customers, so there are some non-English characters (largely Eastern European) in there. I checked the .tim files on disk and they added up to 40 GB.
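The .tim total was just a sum over the term-dictionary files under the data directory; a rough sketch of that tally is below (the path is the default and only an assumption, adjust to your path.data):

```python
# Sketch: sum the size of all Lucene term-dictionary (.tim) files on disk.
# /var/lib/elasticsearch is assumed; substitute your actual path.data.
import pathlib

data_dir = pathlib.Path("/var/lib/elasticsearch")
total = sum(f.stat().st_size for f in data_dir.rglob("*.tim"))
print(f"total .tim size: {total / 2**30:.1f} GiB")
```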
On shard size: I thought 40 GB was crossing the "reasonable" limit suggested in a few online posts. Is there a shard size beyond which query latency and/or shard replication / relocation is severely hampered?
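To see where our shards currently sit relative to that limit, we check sizes with the _cat shards API; a small sketch (again assuming http://localhost:9200 without auth):

```python
# Sketch: list per-shard store sizes in GB via the _cat/shards API.
import requests

shards = requests.get(
    "http://localhost:9200/_cat/shards",
    params={"format": "json", "bytes": "gb", "h": "index,shard,prirep,store"},
).json()

for s in shards:
    size = s.get("store") or "?"  # store can be null for unassigned shards
    print(f"{s['index']} shard {s['shard']} ({s['prirep']}): {size} GB")
```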