Hi,
I wanted to share with ES users a nice little story about using ES on AWS, with a very important note at the end for anybody running ES there:
An ES user (a really cool company that is doing great stuff with ES; if the company is listening, I think it would be great if you shared how you use ES) was having problems with ES when doing load testing.
Every once in a while, ES would freeze up for a few minutes, and then "release". This can happen for several reasons, but looking at the logs we saw messages in which ES reported a GC (garbage collection) on ParNew that took several minutes.
A GC that takes several minutes on ParNew, the JVM's young-generation collector, is very suspicious. It can happen for several reasons: the ES process being swapped (which is really bad for JVM-based processes), or a possible bug in the JVM (several are listed as fixed across different JVM versions).
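As a quick way to check the swapping theory, here is a minimal sketch (not what we actually ran) that reads how much of a process is swapped out. It assumes a Linux kernel that reports a VmSwap line in /proc/<pid>/status; the function names are made up for illustration.

```python
import re


def parse_vm_swap_kb(status_text):
    """Return the VmSwap value (in kB) found in /proc/<pid>/status text.

    Returns None when the kernel does not report a VmSwap line.
    """
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def swapped_kb(pid):
    """Read /proc/<pid>/status and return the swapped-out size in kB, or None."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vm_swap_kb(status_file.read())
```

Calling swapped_kb() with the pid of the ES java process and getting a non-zero value means part of the process is sitting in swap; zero points away from swapping and toward the other causes.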
I won't go through all the things we tried in order to fix it, but the problem ended up being Ubuntu 10.04. Once we upgraded to Ubuntu 10.10, the problem was solved.
I don't know what change in Ubuntu fixed the problem. What I do know is that people were running 10.04 for a long time on AWS with no problem, which leads me to suspect that AWS changed something (like their Xen version) that triggered this problem. Hopefully, now that they provide Elastic Beanstalk, they will start testing more thoroughly how their internal upgrades affect the JVM.
Funnily enough, I started to hear about this from several users around the same time, and for all of them, upgrading to Ubuntu 10.10 solved the problem.
So, if you are running Ubuntu 10.04 on AWS, make sure you upgrade to 10.10.
-shay.banon