I've got a daily index with a couple of high-cardinality fields. Even after reading some threads and the available documentation, I'm not sure how I should proceed. The index gets ~600M docs per day, and the most requested high-cardinality field has ~20M unique values. The index has 6 shards across 8 nodes. I'm trying to lower overall memory usage, and I'm not sure whether adding eager global ordinals to the affected fields' mappings would help in any way, or whether changing the execution hint to map would do any harm. Unfortunately, the queries aggregate those unique values into different term buckets almost every time.
Any suggestions?
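For reference, here is roughly what I'm considering (index and field names below are placeholders for my actual ones):

```json
PUT /daily-index-2024.01.01/_mapping
{
  "properties": {
    "user_id": {
      "type": "keyword",
      "eager_global_ordinals": true
    }
  }
}

GET /daily-index-2024.01.01/_search
{
  "size": 0,
  "aggs": {
    "by_user": {
      "terms": {
        "field": "user_id",
        "execution_hint": "map"
      }
    }
  }
}
```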
If the daily indices are no longer being written to, you can reduce memory usage by force-merging them down to a single segment, as described in this webinar. It can take a while and generate a lot of disk I/O, though.
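As a rough sketch, the force merge call would look something like this (the index name is a placeholder; run it only on indices that are no longer receiving writes):

```json
POST /daily-index-2024.01.01/_forcemerge?max_num_segments=1
```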