I posted this on IRC, but my GMT+11 timezone is evidently not friendly for
that, so as a backup I'm posting the text here for anyone who might have
experience with this:
I have an application using a TransportClient configured to connect to a
2-node ES cluster (I'll leave aside for now why we have to use the
TransportClient, but it's rational).
One of the ES nodes had a faulty backplane and died.
ES of course kept on trucking with the other node.
However, since that event the application client has burnt a hell of a lot
of CPU,
which, looking at the thread dumps, appears to be in the "New I/O client worker #1-5 daemon" style threads used by ES.
I thought that, with the one ES node dead, there was some looping logic
trying to re-establish the connection to it,
so I waited until the Dell guys replaced the backplane and we restored that
node.
Once the cluster was back in green state I was hoping the CPU burn would go
away, but alas no.
Now, looking at one of our other instances running in a similar config, I
note the ES app threads are always runnable because of the NIO, but they're
generally in a sleep state when I look at them.
Has anyone else seen this sort of problem?
I'm just gathering a known 'good' thread dump to compare this with.
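Since a thread dump shows NIO worker threads as runnable whether they are idle or spinning, per-thread CPU time can help tell the two apart. Below is a minimal sketch, assuming it runs inside the same JVM as the client application (for example from a debug hook); the class name WorkerCpuSampler and its helper methods are hypothetical, and only the standard java.lang.management API is used. It samples cumulative CPU time twice for threads whose names start with "New I/O client worker" and prints how much CPU each one consumed in between.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: measures CPU used by named threads.
final class WorkerCpuSampler {

    /** Cumulative CPU nanoseconds, keyed by "id:name", for live threads with the given name prefix. */
    public static Map<String, Long> cpuNanosByThread(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<String, Long>();
        // Per-thread CPU timing is optional in the JVM; return nothing if unavailable.
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                continue; // thread already exited, or not one we care about
            }
            long cpu = mx.getThreadCpuTime(id);
            if (cpu >= 0) { // -1 means the thread died between the two calls
                result.put(id + ":" + info.getThreadName(), cpu);
            }
        }
        return result;
    }

    /** CPU nanoseconds consumed per thread between two samples; new threads count from zero. */
    public static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<String, Long>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            Long earlier = before.get(e.getKey());
            diff.put(e.getKey(), e.getValue() - (earlier == null ? 0L : earlier));
        }
        return diff;
    }

    public static void main(String[] args) throws InterruptedException {
        // Thread name prefix taken from the thread dumps described above.
        String prefix = "New I/O client worker";
        long intervalMs = args.length > 0 ? Long.parseLong(args[0]) : 5000L;

        Map<String, Long> before = cpuNanosByThread(prefix);
        Thread.sleep(intervalMs);
        Map<String, Long> after = cpuNanosByThread(prefix);

        for (Map.Entry<String, Long> e : delta(before, after).entrySet()) {
            System.out.println(e.getKey() + " used " + (e.getValue() / 1000000L)
                    + " ms CPU in " + intervalMs + " ms");
        }
    }
}
```

A worker that is merely parked in a select call should show close to zero CPU over the interval, while one that is spinning should show a figure approaching the interval length. Running the same sampling on the known 'good' instance would give a baseline to compare against.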
Oh geez, bad form by me not quoting the version. Yes, 0.17.9 is what we're
using. I'm planning on upgrading to 0.18.x in the next month, so that's good
news.
Thanks Shay.
On Thursday, 8 December 2011, Shay Banon kimchy@gmail.com wrote:
Here's a gist: https://gist.github.com/1440329