Connection pool

Hi, is there any configuration for a connection pool in Elastic?
We are just moving from Oracle to Elastic.
It seems that we get poor performance as we increase the number of client threads.

Could you please provide some additional information about the set-up and configuration of your cluster as well as how you are interacting with the cluster? This would help us troubleshoot any issues you are having.

We use the basic Elastic configuration with the following changes:
index.merge.scheduler.max_thread_count: 1 (as we don’t use SSD)
heap size = 20GB
mlockall=true

We created a new index with 8 shards and 1 replica. We indexed 50M documents and started running JMeter REST API calls with filters. JMeter is running with 10 threads (that's the optimized number we found for this test). We notice that with 2 nodes we get about 450 calls per second and with 3 nodes we get 600 calls per second, but when we add a 4th node the throughput remains at 600 calls per second and doesn't increase.
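To put the figures above in perspective, here is a small sketch of the arithmetic. It assumes JMeter acts as a closed-loop client with no think time (each of the 10 threads waits for a response before sending the next request), which the post does not state; the function names are illustrative only.

```python
# Throughput figures reported in the post: node count -> calls per second.
reported = {2: 450, 3: 600, 4: 600}
CLIENT_THREADS = 10  # JMeter threads used in the test


def per_node_throughput(nodes, calls_per_sec):
    """Calls per second per node, as a simple ratio of total rate to node count."""
    return calls_per_sec / nodes


def mean_latency_ms(threads, calls_per_sec):
    """Average response time implied by a closed-loop client (assumption):
    throughput = threads / latency, so latency = threads / throughput."""
    return threads / calls_per_sec * 1000.0


for nodes, rate in sorted(reported.items()):
    print(nodes, "nodes:",
          round(per_node_throughput(nodes, rate), 1), "calls/s per node,",
          round(mean_latency_ms(CLIENT_THREADS, rate), 1), "ms per call")
```

Under that assumption, a fixed thread count caps throughput at threads divided by latency, so adding a node only helps if it lowers the time per call or if more client threads are added.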

The JMeter client doesn’t reach its memory / CPU / Network limits at all.

Are you sending queries to all of the nodes? When you add nodes, have you tested increasing the number of threads?

If the application is query intensive, as in your benchmark, it may, depending on the size of your data, help to increase the number of replicas so there are more shard copies available to serve data. If you have a small data set that can fit in memory, you may even want to go as far as setting the number of replicas so that all nodes hold all the data.
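As a rough sketch of the "all nodes hold all the data" case: each shard has one primary, so a cluster with N data nodes needs N - 1 replicas for every node to hold a full copy. The helper names below are hypothetical; the JSON body is the shape accepted by the index update-settings API (`PUT /<index>/_settings`), and the 4-node figure assumes all four nodes hold data.

```python
import json


def replicas_for_full_copy(data_nodes):
    """Replica count so that each data node can hold a copy of every shard.

    One copy is the primary, so the other nodes need data_nodes - 1 replicas.
    """
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    return data_nodes - 1


def replica_settings_body(replicas):
    """JSON body for the index update-settings API (PUT /<index>/_settings)."""
    return json.dumps({"index": {"number_of_replicas": replicas}})


# Example with the 4-node cluster from the thread (assuming all 4 hold data).
print(replica_settings_body(replicas_for_full_copy(4)))
```

Whether this is worthwhile depends on the data fitting in memory on every node, since each extra replica multiplies the storage used.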

Do you have a query node configured? That will load balance the requests to the data nodes, given that shard allocation is balanced.

Yeah, make sure JMeter is load balancing across all the nodes.

To do that, simply create a CSV data set with a list of your nodes. So...

host1,9200
host2,9200
host3,9200
host4,9200

Under the variable names of the CSV data set, put something like: host,port
Sharing mode should be: All threads

And then in your HTTP sampler just reference the variables ${host} and ${port}. Each thread will cycle through the CSV data set and use a different host per request.
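Outside JMeter, the same rotation can be sketched in a few lines. This is only an illustration of cycling through the node list, not what JMeter does internally; the hostnames are the placeholders from the CSV above and the `/_search` path is an assumed example endpoint.

```python
from itertools import cycle

# Same host,port rows as the CSV data set above (hostnames are placeholders).
NODES = [("host1", 9200), ("host2", 9200), ("host3", 9200), ("host4", 9200)]


def round_robin_urls(nodes, path="/_search"):
    """Yield request URLs forever, moving to the next node on every request."""
    for host, port in cycle(nodes):
        yield "http://{}:{}{}".format(host, port, path)


urls = round_robin_urls(NODES)
first_five = [next(urls) for _ in range(5)]
for url in first_five:
    print(url)
```

The fifth URL wraps back to the first host, so requests are spread evenly as long as every node is listed.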

No, we only send to the master node.
Yes, we increased the number of threads but did not get more throughput.

How do I configure the query node?

If you have a dedicated master node, it should be left to manage the cluster and should not serve traffic. You should, as suggested, set up JMeter to connect and send requests to all data nodes directly.

OK, did that.
I now have a very high load average on all nodes, twice the number of cores (8 cores).
Please advise.
Maybe it is a cache setting?

A query node is one where node.data and node.master are both set to false.

Client nodes are smart load balancers that take part in some of the processing steps. Let's take an example:

We can start a whole cluster of data nodes which do not even start an HTTP transport by setting http.enabled to false. Such nodes will communicate with one another using the transport module. In front of the cluster we can start one or more "client" nodes which will start with HTTP enabled. These client nodes will have the settings node.data: false and node.master: false. All HTTP communication will be performed through these client nodes.

These "client" nodes are still part of the cluster, and they can redirect operations exactly to the node that holds the relevant data without having to query all nodes. However, they do not store data and do not perform cluster management operations. The other benefit is that for scatter / gather based operations (such as search), the client nodes start the scatter process and therefore also perform the actual gather processing. This frees the data nodes to do the heavy work of indexing and searching, without needing to process HTTP requests (parsing), overload the network, or perform the gather processing.
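The two node types described above can be summarized as settings for each node's elasticsearch.yml. The sketch below uses only the setting names mentioned in the text (http.enabled, node.data, node.master); they apply to the Elasticsearch version being discussed and may differ in other releases. The helper functions are hypothetical and only print the lines you would place in each file.

```python
def node_settings(role):
    """Return elasticsearch.yml settings for a 'data' or 'client' node,
    using only the setting names mentioned in the text above."""
    if role == "data":
        # Data nodes hold shards and do not start the HTTP transport.
        # node.data is spelled out here for clarity.
        return {"node.data": True, "http.enabled": False}
    if role == "client":
        # Client nodes accept HTTP, hold no data, and do no cluster management.
        return {"node.data": False, "node.master": False, "http.enabled": True}
    raise ValueError("unknown role: " + role)


def to_yaml_lines(settings):
    """Render flat settings as 'key: value' lines with lowercase booleans."""
    return ["{}: {}".format(k, str(v).lower()) for k, v in sorted(settings.items())]


for role in ("data", "client"):
    print("# " + role + " node")
    print("\n".join(to_yaml_lines(node_settings(role))))
```

With this layout, the load test would send its HTTP requests to the client nodes only, since the data nodes no longer listen on HTTP.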

OK, thanks, I will do it.
Will it resolve the high I/O wait we are experiencing?