I am having the exact same problem at the moment. So far it has been going very well with ES.
But since yesterday I have been running a cleanup script that uses the 'more like this' query to find duplicate entries, walking through all documents in descending id order (btw: is there a better way?).
After 30,000-50,000 docs the same problem you described happens: the machine no longer answers HTTP requests (they time out), yet the cluster status stays green.
My gut feeling is that it has to do with the number of open file descriptors. I have already set the limit to 32000, but ulimit -a still shows the Ubuntu 10.04 default of 1024, although bigdesk shows the value 32000.
This would also explain why the cluster state is still green. To my understanding the ES servers use persistent connections between them, so they would only hit the problem when they create new index files or rebalance shards, right?
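Since ulimit -a only reports the limit of the shell it is run in, the value that matters is the one the running ES process actually got. A minimal sketch of how that could be checked on Linux, assuming /proc/<pid>/limits is available (it is not on OS X); the function names are made up for illustration:

```python
import resource


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: Max open files <soft> <hard> files
            soft, hard = line.split()[3:5]
            return soft, hard
    return None


def read_process_limit(pid):
    """Read the open-file limit of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        return parse_max_open_files(f.read())


def own_limit():
    """Limit of the current process, i.e. what a child started from here inherits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling read_process_limit with the pid of the ES java process would show whether it really runs with 32000 or with the 1024 that the shell reports.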
This problem is reproducible. It takes a while, but it happens even on my development MacBook Pro. I am using ES 0.19.3.
On 14.06.2012 at 23:29, Shay Banon wrote:
Yes, assuming it was not responding to pings from the master as well. It takes time though, based on the ping timeout settings. They are pretty conservative.
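To give a rough feel for "it takes time", here is a back-of-the-envelope sketch. It assumes the zen fault detection settings discovery.zen.fd.ping_interval, ping_timeout and ping_retries with the documented defaults of 1s, 30s and 3; these values and the simple formula are assumptions to check against the docs for your version, not the exact algorithm ES uses.

```python
def rough_detection_seconds(ping_interval=1.0, ping_timeout=30.0, ping_retries=3):
    """Rough upper bound before a silent node is dropped from the cluster.

    Assumes every ping has to run into its full timeout and that the
    node is removed after ping_retries consecutive failures.
    """
    return ping_retries * (ping_interval + ping_timeout)


if __name__ == "__main__":
    # With the assumed defaults this prints 93.0
    print(rough_detection_seconds())
```

So with those assumed defaults a node that stops answering pings could stay in the cluster for about a minute and a half. A node that hangs on HTTP but still answers pings would not be removed at all.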
On Wed, Jun 13, 2012 at 11:55 AM, Davie Moston email@example.com wrote:
We recently had a situation where one of our nodes was hanging, http requests to 9200 did not return, and indexing/search requests were timing out.
I'm still looking into the cause of this, but I'm slightly surprised that the cluster state stayed green.
When I shut down the hung node, the other node processed the requests fine.
Presumably this is a bug - that node should have been disconnected from the cluster, right?