This information is collected with the SAR tool. I checked the stats on the master and data nodes and I don't see any bottleneck.
I only see a bottleneck on the Rally side, i.e. an I/O bottleneck for about half of the indexing duration. The storage for esrally is NVMe with 32,000 IOPS, and on the data side I am using memory as storage. I haven't seen this issue with the 4x and 8x instances.
By indexing to 2 shards on a single node, you can expect the cluster to have at most 2 concurrent write threads allocated to indexing. With 64 vCPUs, the write thread pool is automatically sized at 64. To saturate the cluster:
Set the number of shards to 64.
Set the number of indexing clients to 2x the number of node CPU cores, 128 in this case.
You should expect a big difference in CPU usage after making these changes.
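A minimal sketch of the index side of this change, assuming a recent elasticsearch Python client, a local endpoint, and a hypothetical index name (the 128 indexing clients are configured on the Rally side, not on the cluster):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

# Create the target index with 64 primary shards so all 64 write threads
# on the node can be kept busy; replicas are left at 0 for a single node.
es.indices.create(
    index="rally-test",  # hypothetical index name
    settings={"index": {"number_of_shards": 64, "number_of_replicas": 0}},
)
```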
How does hyperthreading play a role in indexing performance? As I see it, the number of processors is taken as the vCPU count, not the physical core count.
The bulk thread pool is sized to the number of vCPUs. With hyperthreading, each core has two hardware threads enabled, so doubling the indexing clients means two concurrent indexing clients are mapped to one write thread. Did I get that right?
The JVM's runtime processor count (Runtime.getRuntime().availableProcessors()) is taken from the container. If the resources in the YAML only define a CPU request, the number of processors will be taken as resources.requests.cpu, and if a limit is defined, the number of processors will be taken as resources.limits.cpu. Is my understanding correct?
If I have 3 data nodes, plus 1 master and 1 esrally node, each with 8 vCPUs, then I will have 8*3 = 24 vCPUs for the data nodes, each node with 8 write threads. In that case, how will the shards be distributed across the data nodes if I take 8 shards per data node in order to saturate the cluster? Kindly give some idea on how to approach the number of shards and bulk indexing clients in the case of multiple data nodes.
How does hyperthreading play a role in indexing performance? As I see it, the number of processors is taken as the vCPU count, not the physical core count.
The role hyperthreading plays in indexing performance should be evaluated on a case-by-case basis. You can generally expect a hyperthreaded core not to perform as well as a physical core, an aspect that might only matter if the node is CPU bound.
The bulk thread pool is sized to the number of vCPUs. With hyperthreading, each core has two hardware threads enabled, so doubling the indexing clients means two concurrent indexing clients are mapped to one write thread. Did I get that right?
Doubling the number of indexing clients is something we do with Rally for saturating the target's write thread pool. ES will use a vCPU, as it is presented by the system, per Java thread. Hyperthreading is only a factor as it relates to whatever performance impact is induced by the hyperthreading feature itself.
The JVM's runtime processor count (Runtime.getRuntime().availableProcessors()) is taken from the container. If the resources in the YAML only define a CPU request, the number of processors will be taken as resources.requests.cpu, and if a limit is defined, the number of processors will be taken as resources.limits.cpu. Is my understanding correct?
The number of processors will be whatever the container sees. Limits are enforced through kernel cgroups, and any time a container reaches its allocated limit, the process is forced to wait. We typically refer to this behavior as CPU throttling.
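As a rough illustration of how a CPU limit turns into a processor count inside a container, here is a sketch of the cgroup v2 quota/period calculation. This is not the JVM's actual code, but its container support performs a similar computation for Runtime.getRuntime().availableProcessors():

```python
import os

def effective_cpus(path="/sys/fs/cgroup/cpu.max"):
    # cgroup v2 cpu.max contains "<quota> <period>", or "max <period>" when no limit is set.
    try:
        quota, period = open(path).read().split()
    except FileNotFoundError:
        return os.cpu_count()  # not running under cgroup v2
    if quota == "max":
        return os.cpu_count()  # no CPU limit: the container sees all host vCPUs
    return max(1, round(int(quota) / int(period)))  # e.g. 400000 / 100000 -> 4 CPUs

print(effective_cpus())
```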
If I have 3 data nodes, plus 1 master and 1 esrally node, each with 8 vCPUs, then I will have 8*3 = 24 vCPUs for the data nodes, each node with 8 write threads. In that case, how will the shards be distributed across the data nodes if I take 8 shards per data node in order to saturate the cluster? Kindly give some idea on how to approach the number of shards and bulk indexing clients in the case of multiple data nodes.
This is why we benchmark. Replicas need to be factored in for a multi-node cluster. Each replica performs the same indexing work as its primary, so a 12 primary, 1 replica scheme might make the most sense in this scenario. It very much depends on the number of indices and on search behavior. With 12 primaries, you may want to keep the 2 * #vCPUs clients, given that the cluster may end up placing more than 4 primaries on a single node. The nodes will use the write queue as needed, and we want to keep those queues >1 if we are to keep the cluster saturated.
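To check whether the cluster is actually saturated during a run, you could watch the write thread pool per node; a sketch assuming the elasticsearch Python client and a local endpoint:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

# Per-node write thread pool stats: during indexing you want 'active' at the
# pool size and 'queue' staying above 1, with 'rejected' not growing.
print(
    es.cat.thread_pool(
        thread_pool_patterns="write",
        h="node_name,name,size,active,queue,rejected",
        v=True,
    )
)
```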
Q. Does this I/O bottleneck on the client side for approximately half of the indexing duration have any impact on indexing performance?
With a further increase of the bulk size to 2000, I get the warning below:
[[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup](https://discuss.elastic.co/t/warning-no-throughput-metrics-available-for-index-append-likely-cause-the-benchmark-ended-already-during-warmup/306935)
I wonder if what you are seeing on the client side is the data generation and warm-up phase of the benchmark, though there is some overlap. You want to see a saturated write thread pool cluster side. With all the storage in memory, indexing is going to be very fast. You might even need to go beyond a 10000 bulk size along with expanding your dataset in order to collect metrics.
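If you want to get a feel for what a larger bulk size means outside Rally, here is a hypothetical sketch with the elasticsearch Python bulk helpers (the index name and document count are made up):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

def docs():
    # Expand the dataset so the run outlasts the warmup period.
    for i in range(1_000_000):
        yield {"_index": "rally-test", "_source": {"id": i, "message": f"doc {i}"}}

# chunk_size plays the role of Rally's bulk size: documents sent per _bulk request.
for ok, _ in helpers.streaming_bulk(es, docs(), chunk_size=10_000):
    assert ok
```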
I am able to saturate:
4xlarge with a 1000 bulk size.
8xlarge with a 2000 bulk size.
16xlarge: not able to saturate; as I move beyond a 1400 bulk size I get the warning below.
[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup
Reason: indexing finishes before the warmup time ends.
Q1. Does the I/O bottleneck on the client side for approximately half of the indexing duration have any impact on indexing performance?
There is some overlap with the indexing duration as well. Does this have any impact on indexing performance? On the client side I have NVMe storage.
Q2. For 16xlarge I am getting the no-metrics warning with a 0 error rate. How can I overcome the no-metrics warning?
I went through the previously reported issues and saw people suggesting to set the warmup time to 0. Does this have an impact on the performance results? Is there any other workaround for this issue?