I'm running a cluster consisting of one load-balancing node and two
data-nodes. Currently I'm trying to fill the cluster with a lot of data,
and for that we have a threaded Java program, using TransportClients to
index data on port 9300. This program is run with 8 threads trying to index
data at the same time. We have then run another 8 threads from another
computer, starting to index data in the cluster. We then see response times
increase / performance decrease quite a lot. My first questions:
Are there limitations on number of transport clients to connect to a
cluster (on port 9300)? Is there any configuration I should consider? In
Paramedic I see http.current_open holding a lot of connections. Is there a
limit there?
If you connect transport clients to the cluster, you should connect to
all of the available data nodes for better network load distribution.
Note that a TransportClient can use many connections; it is not a 1:1
connection model. A client uses netty-based connection and thread pools,
which scale into the tens or even hundreds of thousands of
simultaneous connections. There is no limit on connections on port 9300.
If you observe slowdowns, the cause may be your setup, or your node
activity/capacity. If you observe many open files/sockets, it may be
that you are not using persistent/reusable socket connections, or that
you are opening/closing sockets too fast, which is not required with ES.
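
To illustrate the "one client, many threads, all data nodes" pattern described above, here is a minimal sketch. It is not Elasticsearch code: RoundRobinClient is a hypothetical stand-in for a thread-safe client such as TransportClient, so that the example runs without the ES jar, and the host names datanode1 and datanode2 are made up. With the real client you would create a single instance, register every data node on it, and hand that same instance to all indexing threads instead of opening a new client per thread or per request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

public class SharedClientSketch {

    /** Hypothetical stand-in for a thread-safe client; it only counts requests per node. */
    static final class RoundRobinClient {
        private final List<String> nodes;
        private final AtomicIntegerArray perNode;
        private final AtomicInteger next = new AtomicInteger();

        RoundRobinClient(List<String> nodes) {
            this.nodes = nodes;
            this.perNode = new AtomicIntegerArray(nodes.size());
        }

        /** Picks the next node in round-robin order; a real client would send doc there. */
        void index(String doc) {
            int i = Math.floorMod(next.getAndIncrement(), nodes.size());
            perNode.incrementAndGet(i);
        }

        int sentTo(int nodeIndex) {
            return perNode.get(nodeIndex);
        }
    }

    /** Creates ONE client and shares it between all worker threads. */
    static RoundRobinClient run(List<String> nodes, int threads, int docsPerThread)
            throws InterruptedException {
        final RoundRobinClient client = new RoundRobinClient(nodes);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int d = 0; d < docsPerThread; d++) {
                    client.index("doc-" + id + "-" + d);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return client;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed node addresses, for illustration only.
        List<String> nodes = Arrays.asList("datanode1:9300", "datanode2:9300");
        RoundRobinClient c = run(nodes, 8, 1000);
        System.out.println("node 0: " + c.sentTo(0) + ", node 1: " + c.sentTo(1));
    }
}
```

The point of the sketch is the lifecycle, not the counting: the client is built once in run() and reused by all 8 threads, which is what keeps the number of sockets stable.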
Jörg
On 11.04.13 13:21, Per Ekman wrote:
Hi
I'm running a cluster consisting of one load-balancing node and two
data-nodes. Currently I'm trying to fill the cluster with a lot of
data, and for that we have a threaded Java program, using
TransportClients to index data on port 9300. This program is run with
8 threads trying to index data at the same time. We have then run
another 8 threads from another computer, starting to index data in the
cluster. We then see response times increase / performance decrease
quite a lot. My first questions:
Are there limitations on number of transport clients to connect to a
cluster (on port 9300)? Is there any configuration I should consider?
In Paramedic I see http.current_open holding a lot of connections. Is
there a limit there?