ES 5.2.2, Ubuntu 16.04, Oracle JRE8
I have 6 m4.large nodes holding 120 M docs, split across nearly 70 indices with 5 shards and 1 replica each.
Queries only hit the latest data, about 1 M docs, selected using time filters.
This is a test cluster with no real traffic.
The request_cache is only about 5-10 MB while heap usage is around 3 GB.
When I simulate a couple of queries (https://gist.github.com/vanga/2cd8e1fd7c3b2bffa89fda8ce3a8a481), heap usage keeps increasing until an old GC runs, after which it returns to normal. With real traffic, however, the frequent old GCs make the cluster unusable.
The increase happens only on the node I send the request to. I have also force merged the segments with max_num_segments=2.
Fielddata usage is not significant either.
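For anyone who wants to compare numbers, here is a minimal sketch of how the per-node figures above (heap used, old GC count and time, request cache, fielddata) can be read from the node stats API. It is only an illustration: the base URL `http://localhost:9200` is a placeholder, and `fetch_node_stats` and `summarize` are hypothetical helper names, not part of any Elasticsearch client.

```python
import json
from urllib.request import urlopen


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM and index-level stats for all nodes.

    base_url is a placeholder; point it at any node of the cluster.
    """
    with urlopen(base_url + "/_nodes/stats/jvm,indices") as resp:
        return json.load(resp)


def summarize(stats):
    """Reduce a node stats response to the per-node numbers discussed here."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        old_gc = jvm.get("gc", {}).get("collectors", {}).get("old", {})
        indices = node.get("indices", {})
        rows.append({
            # Fall back to the node id if the name is missing.
            "node": node.get("name", node_id),
            "heap_used_bytes": jvm.get("mem", {}).get("heap_used_in_bytes"),
            "old_gc_count": old_gc.get("collection_count"),
            "old_gc_time_ms": old_gc.get("collection_time_in_millis"),
            "request_cache_bytes": indices.get("request_cache", {}).get("memory_size_in_bytes"),
            "fielddata_bytes": indices.get("fielddata", {}).get("memory_size_in_bytes"),
        })
    return rows
```

Calling `summarize(fetch_node_stats())` before and after running the gist queries should show whether `heap_used_bytes` and `old_gc_count` move only on the node that receives the request.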
Any leads on how to debug this further would be helpful.