I'm new to ES and I have an ES 1.5.2 cluster with:
- 4 nodes.
- Each node is currently an Amazon EC2 (c4.2xlarge) instance with:
  - 16 GB RAM
  - 8 cores
  - 8 GB RAM assigned to the JVM heap
- 7.9 billion documents
- 2656 shards
- 6.57 TB of data
- All indices are split by day (one index per day)
- Using Logstash to bulk insert data
- Very write-heavy, with far fewer reads. Mostly analytics/clickstream/log data
- I've used doc_types wherever I could, though it's possible I've missed some
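To put those numbers in per-node terms, here is a rough back-of-the-envelope sketch. It assumes the 2656 shards are spread evenly across the 4 nodes, which may not be exactly true, and the helper name `per_node_load` is just something made up for illustration.

```python
# Figures taken from the cluster description above.
NODES = 4
TOTAL_SHARDS = 2656
TOTAL_DOCS = 7_900_000_000
TOTAL_DATA_TB = 6.57
HEAP_GB_PER_NODE = 8


def per_node_load(nodes, shards, docs, data_tb, heap_gb):
    """Return rough per-node figures.

    Assumption: shards (and therefore data) are distributed evenly
    across all nodes.
    """
    shards_per_node = shards / nodes
    return {
        "shards_per_node": shards_per_node,
        "docs_per_shard": docs / shards,
        "data_tb_per_node": data_tb / nodes,
        # Heap available per shard if the whole heap were divided evenly.
        "heap_mb_per_shard": heap_gb * 1024 / shards_per_node,
    }


if __name__ == "__main__":
    load = per_node_load(
        NODES, TOTAL_SHARDS, TOTAL_DOCS, TOTAL_DATA_TB, HEAP_GB_PER_NODE
    )
    for name, value in load.items():
        print(f"{name}: {value:,.2f}")
```

Under that assumption each node carries 664 shards, so an 8 GB heap works out to only about 12 MB of heap per shard.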
Heap usage is very close to 100%. It seems to reach 100%, drop a little, and then climb back to 100%. I suspect garbage collection is running constantly, leaving me little to no heap for anything else. Even if I enlarge the heap to, say, 15 GB, it does the same thing: usage grows towards 15 GB, gets cleaned up a bit, and then goes back to 15 GB.
The filter cache only has about 500 MB in it.
What is using so much of the heap? Does it simply take that much because of the number of documents? I feel like I'm missing some understanding here. Let me know what statistics you need and I'll post them. Thanks so much.
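In case it helps, this is a minimal sketch of how I could pull the heap-related numbers from the nodes stats API (`GET /_nodes/stats`). The URL `http://localhost:9200` is an assumption about where a node is reachable, the function names are made up, and the field names are the ones I believe the 1.x response uses, so they are worth checking against the real output.

```python
import json
from urllib.request import urlopen


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch the nodes stats document.

    Assumption: base_url points at a reachable node; adjust as needed.
    """
    with urlopen(base_url + "/_nodes/stats") as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_heap(stats):
    """Pick out, per node, the heap total and a few known heap consumers.

    Missing fields default to 0 so a differently shaped response does not
    crash the summary.
    """
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        mem = node.get("jvm", {}).get("mem", {})
        indices = node.get("indices", {})
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_bytes": mem.get("heap_used_in_bytes", 0),
            "heap_max_bytes": mem.get("heap_max_in_bytes", 0),
            "segments_bytes": indices.get("segments", {}).get("memory_in_bytes", 0),
            "fielddata_bytes": indices.get("fielddata", {}).get("memory_size_in_bytes", 0),
            "filter_cache_bytes": indices.get("filter_cache", {}).get("memory_size_in_bytes", 0),
        })
    return rows


if __name__ == "__main__":
    for row in summarize_heap(fetch_node_stats()):
        print(row)
```

This only covers segment memory, fielddata, and the filter cache, so it is not a full breakdown of the heap, but it should show whether one of those accounts for most of the usage.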