I'm trying to scale my indexing for the first time, and I'm running into
connection problems. At a certain scale, cURL connections from my
indexers start failing with cURL error 7 (connect failed). It looks like ES
just stops accepting all HTTP connections for a period of time. I cannot
find the root cause.
I'm running on an Amazon C3.4XL. The processors are not maxed, memory is
not maxed, IO is not showing issues. I'm not seeing problems in the ES
log, but I'm not sure I have logging fully enabled. I've tried increasing
the thread_pool for the indexer, and that doesn't help. I'm not seeing any
rejected connections there. I'm at a loss.
The closest I can get is a guess using data from Bigdesk. When the number
of HTTP channels starts exceeding the number of transport channels, I start to
see the problem emerge. I have no idea if this is related, but it's the
only metric I've traced that seems correlated.
Are you using HTTP keep-alive connections? If not, consider switching to
them. Opening a new TCP connection for every request not only adds latency
but also uses up file handles in the Elasticsearch process (they count
against its number of open files). If your client or language does not
support keep-alive, you should either use nginx or at least try to use bulk
operations in order to create fewer TCP connections. However, this is just a
workaround.
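
To make the suggestion concrete, here is a minimal sketch of an indexer that reuses one keep-alive connection and sends documents in bulk. It is only an illustration under assumptions, not your setup: the host, port, index name and the helper names `build_bulk_body` and `index_batches` are made up, and the action line and Content-Type header may need adjusting for your Elasticsearch version (older releases also expect a `_type` in the action line).

```python
import http.client
import json


def build_bulk_body(index, docs):
    """Serialize docs into the newline-delimited body used by the _bulk endpoint.

    Each document becomes two lines: an action line and the document source.
    Every line, including the last one, ends with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "".join(line + "\n" for line in lines)


def index_batches(host, port, index, batches):
    """Send each batch as one bulk request over a single reused connection."""
    # http.client speaks HTTP/1.1, so the socket is kept open between
    # requests as long as each response is read completely.
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        for docs in batches:
            if not docs:
                continue  # nothing to send for an empty batch
            body = build_bulk_body(index, docs).encode("utf-8")
            conn.request(
                "POST",
                "/_bulk",
                body=body,
                headers={"Content-Type": "application/x-ndjson"},
            )
            resp = conn.getresponse()
            payload = resp.read()  # drain the response before reusing the socket
            if resp.status >= 300:
                raise RuntimeError(
                    "bulk request failed: %d %r" % (resp.status, payload[:200])
                )
    finally:
        conn.close()
```

Something like `index_batches("localhost", 9200, "myindex", batches)` would then open one TCP connection for the whole run instead of one per document. With cURL-based indexers the equivalent is to reuse the same cURL handle across requests rather than creating a new one each time.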