I'm experiencing performance challenges with an Elasticsearch cluster. The cluster consists of 6 data nodes (3 hot with ~30GB heap, 3 warm with ~15GB heap). The cluster holds around 4TB of data total, with a total of 20,000,000,000 (20 billion) docs. Each index has 6 primary shards, and each shard holds 25-35GB of data.
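For reference, per-shard doc counts and on-disk sizes can be checked with the cat shards API (the index name below is a placeholder for my actual indexes):

```
GET /_cat/shards/my-index?v&h=index,shard,prirep,docs,store
```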
The overwhelming majority of the docs are very small, effectively key-value pairs. Those small docs are nested docs, but each one is still indexed as a separate document in Lucene. I have read on numerous occasions that the optimal shard size is "a few GB to a few tens of GB" or "less than but close to 50GB".
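To illustrate the nested-doc inflation: if I understand correctly, `docs.count` in the index stats reflects Lucene-level documents (including nested ones), while the count API returns only root documents, so comparing the two shows the gap (index name is again a placeholder):

```
// docs.count here includes nested Lucene documents
GET /my-index/_stats/docs

// this counts only top-level (root) documents
GET /my-index/_count
```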
I see less info about the ideal number of documents per shard. I know that Lucene (i.e. a single shard) has an upper limit of ~2 billion docs. But is performance really unaffected by the number of docs below that limit? My empirical findings suggest that the number of docs does in fact matter: search threads consistently appear in hot threads even when no indexing is happening, while memory and CPU usage stay low (CPU at ~20%, heap at ~50%).
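For completeness, these are the standard endpoints I'm using to observe this (the `threads` parameter value is arbitrary):

```
GET /_nodes/hot_threads?threads=5
GET /_cat/nodes?v&h=name,cpu,heap.percent,load_1m
```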
Could this indicate that, for my use case, splitting the indexes into more, smaller shards might improve search latency? Under what circumstances should one conclude that the number of shards in use is too small?
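If more shards turn out to help, my understanding is that the split index API would let me test this without a full reindex (index names are placeholders; the source index must be made read-only first, and the target primary shard count must be a multiple of the source count, e.g. 6 → 12):

```
PUT /my-index/_settings
{
  "index.blocks.write": true
}

POST /my-index/_split/my-index-12
{
  "settings": {
    "index.number_of_shards": 12
  }
}
```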