I have observed that Elasticsearch defaults the search thread pool to 3 x the number of CPUs, and even if you increase this to a fixed number it does not really help,
as the threads just end up sharing the same CPU cycles.
Does this mean that, to get the same performance for more concurrent searches,
I either have to scale vertically by adding more CPU cores or horizontally
by adding nodes and replicas, so that the search load is spread across
replica shards?
Here is my situation:
I have a 6 node ES cluster with 6 shards storing a total of 60 million
documents (around 2kb each) and index size of around 32 GB. Each node is a
VM with 4 vCPUs and 8 GB allocated to ES. With 15 concurrent users
the response time is around 270 ms and I see that all 12 threads in the search
pool are busy. If I increase the number of concurrent search users,
the response time keeps climbing and more requests pile up
in the search queue. I even increased the search thread pool size,
but it really did not help.
Is there a template that relates TPS and number of concurrent users to performance and scaling that I could refer to? Maybe some benchmarks or common practices? I understand that it varies by index type and the nature of the searches, but I wanted to see if there are general guidelines for scaling and for certifying TPS or number of concurrent users.
Yes, scaling horizontally (after you have tuned the existing nodes correctly) is
one of the design principles of Elasticsearch.
You can also increase search throughput without increasing disk
requirements or the number of replicas by adding
new Elasticsearch nodes or processes (maybe even on the same host) with
"node.data = false", so that they serve only as search nodes.
If that is not what you want, then either having someone tune your current
nodes or adding new nodes is the way forward.
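
As a rough way to reason about the numbers in the question, here is a back-of-envelope sketch in Python. It is not a benchmark or an official sizing formula. It takes the 3 x CPUs pool size as reported by the poster, applies Little's law (throughput = concurrency / latency) to the measured 15 users at 270 ms, and assumes throughput scales linearly with node count, which is optimistic because every search still fans out to all 6 shards. The function names and the 100 requests/s target are hypothetical, chosen only for illustration.

```python
import math


def search_pool_size(cpus: int, multiplier: int = 3) -> int:
    """Search pool size under the 3 x CPUs default described in the question."""
    return multiplier * cpus


def throughput_rps(concurrent_users: int, latency_s: float) -> float:
    """Little's law for a closed loop with no think time: users / latency."""
    return concurrent_users / latency_s


def nodes_needed(target_rps: float, per_node_rps: float) -> int:
    """Node count for a target rate, ASSUMING linear scaling (optimistic)."""
    return math.ceil(target_rps / per_node_rps)


# Figures taken from the question: 4 vCPUs per node, 6 nodes,
# 15 concurrent users at about 270 ms per search.
pool = search_pool_size(cpus=4)
measured = throughput_rps(concurrent_users=15, latency_s=0.270)
per_node = measured / 6

# Hypothetical target, not from the thread.
target_rps = 100.0
estimate = nodes_needed(target_rps, per_node)

print(f"search threads per node: {pool}")
print(f"measured cluster throughput: {measured:.1f} requests/s")
print(f"nodes for {target_rps:.0f} requests/s (linear assumption): {estimate}")
```

Treat the result as a starting point for a load test rather than a guarantee: real scaling depends on query type, caching, and how replicas are distributed.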