We have 3 nodes running on r3.2xlarge instances (61 GB RAM each) with EBS storage. We're regularly getting failed-node errors caused by running out of memory. We run aggregations over fairly large documents (effectively HTML documents) for analytics, and a single query sometimes covers between 300k and 500k documents.

Anyway, things are REALLY slow and our front end often times out on Heroku (30 second limit).

We need some guidance here. How can we reconfigure this to get better performance?


To add some information: CloudFront shows our system as idle when it is not running queries (0% CPU on all servers). Overall load is very light: we have a couple of hours of daily loading, but otherwise it's mostly querying for analytics.

The entire index that we're querying is 72 GB in size, with 18 million documents, split over 5 shards on 3 servers. As I said above, a normal query will "use" 300-500k of those documents at the high end.

It also often fails outright with an out-of-memory error.
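
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes documents are spread evenly across shards and uses decimal units (1 GB = 1,000,000 KB); the function name `shard_stats` is only illustrative. The index size also includes index structures, so the per-document figure is an estimate, not the raw HTML size.

```python
def shard_stats(index_gb, total_docs, shards, docs_per_query):
    """Rough per-shard figures, assuming an even spread across shards."""
    return {
        # Index size each shard holds
        "gb_per_shard": index_gb / shards,
        # Document count each shard holds
        "docs_per_shard": total_docs // shards,
        # Average on-disk footprint per document, in KB (decimal units)
        "avg_doc_kb": index_gb * 1_000_000 / total_docs,
        # Share of the index touched by one query, in percent
        "query_pct": round(100 * docs_per_query / total_docs, 2),
    }


stats = shard_stats(index_gb=72, total_docs=18_000_000, shards=5,
                    docs_per_query=500_000)
for name, value in stats.items():
    print(f"{name}: {value}")
```

With the numbers from this post, that works out to 14.4 GB and 3.6 million documents per shard, and a high-end query touches a little under 3% of the index.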

That's a pretty large aggregation on large docs, so it's not surprising that you need more heap to handle it.
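
As a rough illustration of the heap point, the sketch below applies the commonly cited Elasticsearch guidance: give the JVM no more than about half of the machine's RAM, and stay below roughly 32 GB so that compressed object pointers remain in effect. The 31 GB cap and the function name `suggested_heap_gb` are assumptions made for this example, not values taken from this thread.

```python
def suggested_heap_gb(ram_gb, cap_gb=31):
    """Heap size in GB: at most half of RAM, capped below ~32 GB.

    cap_gb=31 is an assumed conservative ceiling to stay under the
    compressed-oops threshold; adjust it for your JVM.
    """
    return min(ram_gb // 2, cap_gb)


def jvm_heap_flags(ram_gb):
    """Standard JVM flags setting min and max heap to the same size."""
    heap = suggested_heap_gb(ram_gb)
    return f"-Xms{heap}g -Xmx{heap}g"


# r3.2xlarge has 61 GB of RAM
print(suggested_heap_gb(61))
print(jvm_heap_flags(61))
```

On a 61 GB node this suggests a heap of about 30 GB, leaving the rest of the memory to the operating system's file cache, which doc values rely on.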

Are you using doc values as much as possible? What sort of parsing do you do when ingesting?

Sorry for the late reply. We're using doc values everywhere. The field has about 10M unique terms. How would you go about speeding this up?
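
For reference, a request of the kind discussed here might look like the sketch below, written as a Python dict that can be serialized to JSON. The field names `analysis_id` and `page_terms`, the filter value, and the bucket count are all hypothetical, since the thread does not show the actual query. The `execution_hint` value `map` is a documented option of the terms aggregation that aggregates on field values directly instead of global ordinals; it is usually suggested only when a query matches a small fraction of the index, and whether it helps in this case is untested.

```python
import json


def build_request(filter_field, filter_value, agg_field, top_n=100):
    """Build a search body: filter to a subset, then a bounded terms agg.

    All field names passed in are hypothetical placeholders.
    """
    return {
        # Return no hits, only aggregation results
        "size": 0,
        # Restrict the aggregation to the documents of interest
        "query": {"term": {filter_field: filter_value}},
        "aggs": {
            "top_terms": {
                "terms": {
                    "field": agg_field,
                    # Keep only the top N buckets rather than all terms
                    "size": top_n,
                    # Assumption: worth trying when few docs match
                    "execution_hint": "map",
                }
            }
        },
    }


body = build_request("analysis_id", 42, "page_terms")
print(json.dumps(body, indent=2))
```

Keeping `size` at 0 and bounding the bucket count limits how much data each shard has to return, which matters when the field has millions of unique terms.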