Scalability issue - Rally benchmark on ES 7.0.1

Running Rally against an existing Elasticsearch 7.0.1 cluster. I tried three cluster sizes: 1 node, 2 nodes, and 4 nodes. I don't see any scalability improvement when adding nodes to the cluster.

Rally:

  1. esrally on 1 physical node
  2. Tracks: nyc_taxis, http_logs
  3. Number of shards: 8
  4. Bulk size: 175000

Each physical node: 512 GB RAM, 48 vCores, 80G NIC. 4 virtual IPs mounted to a 20 TB FlashBlade.
Heap set to 31 GB.

CPU utilization is about 50%.
replication=default,
track_params=default
http_logs track: index-append (docs/s)

| Refresh interval | 1 node | 2 nodes | 4 nodes |
|------------------|--------|---------|---------|
| 5s               | 236205 | 242617  | 255423  |
| 30s              | 244623 | 230566  | 229808  |
| -1 (disabled)    | 231718 | 242934  | 252941  |

nyc_taxis:
refresh_interval=-1, replication=0
index-append (docs/s)

| bulk_indexing_clients | shard_count | bulk_size | 1 node | 2 nodes |
|-----------------------|-------------|-----------|--------|---------|
|                    16 |           8 |    175000 | 104366 |  109475 |
|                    16 |          16 |    175000 | 100466 |  109114 |
|                    24 |           4 |    175000 | 132447 |  132356 |
|                    24 |           8 |    175000 | 131560 |  130902 |
|                    32 |          16 |    175000 | 124276 |  123692 |
|                    32 |          32 |    175000 | 124315 |  124315 |

CPU utilization is about 50%; not much disk I/O or memory is used.
I did see these in the log:

 [nyc_taxis][0] now throttling indexing: numMergesInFlight=10, maxNumMerges=9
[2019-05-29T11:37:57,666][INFO ][o.e.i.e.I.EngineMergeScheduler] [sn1-r720-g02-17.abccd.com-0] [nyc_taxis][0] stop throttling indexing: numMergesInFlight=8, maxNumMerges=9

I noticed some of the indices.store.throttle settings have been deprecated.

Hey @cks,

Thanks for your interest in Rally and Elasticsearch. I took the liberty of editing your comment for readability, mostly enclosing data sections using the </> icon.

A similar case has been discussed in this Discuss topic, and the troubleshooting workflow mentioned there would benefit you a lot, in my opinion; additionally, this comment may be useful: Elasticsearch official benchmarking results.

To summarize, there seems to be a bottleneck somewhere.
Have you checked whether Rally or the load-driving machine gets saturated with your settings (a bulk_size of 175000 and 16 or more indexing clients)? I'd especially look at network utilization, followed by CPU and disk.
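
For a quick sanity check, something along these lines on the load driver while a race is running could help (standard sysstat utilities; the 5-second intervals are just examples):

```
# On the Rally load driver, sample utilization every 5 seconds during a race:
sar -n DEV 5      # per-interface throughput; compare rx/tx against the bonded link
mpstat -P ALL 5   # per-core CPU usage; look for individual cores pegged at 100%
iostat -x 5       # per-device disk utilization and await times
```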

The throttling indexing message is interesting; do you remember in which scenario this happened (i.e. single node or 2 nodes, number of indexing clients, and bulk_size)? Also, which Java version are you using?

At some point it's likely you'll hit a limit using a single load driver; for example, a 1Gbps link gets easily saturated with more than one node. This recipe should help you set up a distributed load test driver when you hit that point.
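
As a rough sketch of what that recipe boils down to (the IPs and host names below are placeholders):

```
# On each load generator machine:
esrallyd start --node-ip=10.0.0.11 --coordinator-ip=10.0.0.10

# On the coordinator, reference the load generator hosts when starting the race:
esrally --track=http_logs --pipeline=benchmark-only \
        --target-hosts=es-node1:9200,es-node2:9200 \
        --load-driver-hosts=10.0.0.11,10.0.0.12
```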

Dimitris

Hi Dimitris,
Thanks for your response and for editing my post for readability.
In my observation, network utilization was about 20-30%: bond0 with two 40G NICs on each physical machine. CPU utilization was about 40-50%.
The throttling indexing message came while running on a single node, with 24 clients and a bulk size of 175000. The JDK is the same version that comes with Elasticsearch: OpenJDK Runtime Environment (build 1.8.0_212-b04).
I did see about 30 esrallyd threads on the Rally node, but not much CPU activity.
Is there a way to set up a profiler for indexing?

Hey @cks,

A few clarifications/inline comments below

Each physical node - 512 GB RAM, 48 vCores, 80G NIC. 4 virtual IPs mounts to FlashBlade 20TB.

I'm not overly familiar with FlashBlades, but to clarify in broad terms: the 80Gbps bonded NIC is, in the case of ES nodes, the funnel for all ES read/write IO plus all network traffic, right? Similarly, this NIC on the Rally node is the pipe for all Rally IO plus network traffic.
Also, my understanding is that we are talking about file/object storage like NFS rather than block storage like iSCSI, right?

[nyc_taxis][0] now throttling indexing: numMergesInFlight=10, maxNumMerges=9

This message is really hinting at IO performance issues on the Elasticsearch side; the node holding shard 0 is falling behind in segment merges and Elasticsearch is throttling indexing.
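
If it helps, a rough way to confirm that is to poll the merge statistics of the index while a race is running (the host and index name below are just an example):

```
# Growing "current" merge counts and a rising "total_throttled_time_in_millis"
# indicate that merging cannot keep up with indexing.
while true; do
  curl -s 'http://localhost:9200/nyc_taxis/_stats/merge?pretty' \
    | grep -E '"current"|"total_throttled_time_in_millis"'
  sleep 10
done
```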

So I'd take a look at the IO performance of the underlying network storage (see above question), from the OS PoV and from the storage device itself. For the OS analysis, I recommend the USE Method and the tools mentioned in this checklist.
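
For the NFS side specifically, a starting point on the Elasticsearch nodes could look like this (assuming nfs-utils and sysstat are installed):

```
nfsiostat 5     # per-mount NFS throughput, RTT and queueing while indexing runs
nfsstat -c      # client-side NFS/RPC counters; retransmissions point at trouble
sar -n DEV 5    # how much of the bonded 80G link the NFS traffic actually uses
```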

JDK is the same version that comes with Elastic. OpenJDK Runtime Environment (build 1.8.0_212-b04)

This sounds strange because the Elasticsearch 7.0.1 distribution bundled with Java ships with OpenJDK 12.0.1, unless you explicitly define JAVA_HOME.
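
A quick way to confirm which JVM each node is actually running (the host below is a placeholder):

```
# Elasticsearch 7.0.1 bundles OpenJDK 12.0.1 unless JAVA_HOME overrides it.
curl -s 'http://localhost:9200/_nodes/jvm?filter_path=nodes.*.jvm.version,nodes.*.jvm.vm_name&pretty'
```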


the 80Gbps bonded nic is, in the case of ES nodes, the funnel for all ES read/write IO plus all network traffic right?

Yes. bond0 flat network (40G+40G).

Similarly this nic on the Rally node is the pipe for all Rally IO+network traffic.

Yes, on the same subnet as the ES nodes and the FlashBlade. It also has an 80G bond0 (40G+40G).

Also my understanding we are talking about file/object storage like NFS rather than block storage like iSCSI, right?

Yes. NFS v3 file storage.

JDK: I had previously set JAVA_HOME to point to OpenJDK 1.8 on the Rally node. I then changed it to OpenJDK 12.0.1 to keep both the ES and Rally nodes on the same version and re-ran the benchmarks; the numbers were in the same ballpark. For the ES nodes, JAVA_HOME points to the OpenJDK folder shipped with ES.

I will look into the IO performance of the network storage and the OS-level analysis. The checklist is useful.

In addition to the throttle messages, I also see GC overhead messages:

[2019-05-31T12:43:09,147][INFO ][o.e.m.j.JvmGcMonitorService] [sn1-r720-g02-17.abc.com-0] [gc][41428] overhead, spent [360ms] collecting in the last [1.1s]
[2019-05-31T12:43:10,343][INFO ][o.e.m.j.JvmGcMonitorService] [sn1-r720-g02-17.abc.com-0] [gc][41429] overhead, spent [477ms] collecting in the last [1.1s]
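
A rough way to watch this GC pressure directly on the node while a race runs (the PID below is a placeholder):

```
# Sample GC counters of the Elasticsearch JVM once per second;
# a steadily growing FGC count would point at old-gen pressure.
jstat -gcutil <elasticsearch-pid> 1000
```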

I was able to find the cause of the problem. It looks like the track param 'shard_count' is now 'number_of_shards' in Rally 1.x. I was using 'shard_count:8' when invoking esrally, which did not change the number of shards in the index created by Rally, nor did it complain. After fixing this I see scaling from 1 node to 2 nodes and 4 nodes, with higher CPU utilization and about 2x the docs/s.
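
For reference, the corrected invocation looks roughly like this (the host list is a placeholder; the key change is number_of_shards instead of shard_count):

```
esrally --track=nyc_taxis --pipeline=benchmark-only \
        --target-hosts=es-node1:9200,es-node2:9200 \
        --track-params="number_of_shards:8,bulk_size:175000,bulk_indexing_clients:16"
```
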
Thank you.


Hey Charles,

Glad to hear you figured this out. The issue you hit is an important one from a robustness and user-friendliness point of view; we identified it some time ago and came up with a fix in https://github.com/elastic/rally/pull/688.

Using the latest development version of Rally, I tried your incorrect parameter and it failed with the following message:


...

Some of your track parameter(s) "shard_count" are not used by this track; perhaps you intend to use "number_of_shards" instead.

All track parameters you provided are:
- shard_count

All parameters exposed by this track:
- bulk_indexing_clients
- bulk_size
- cluster_health
- conflict_probability
- index_settings
- ingest_percentage
- number_of_replicas
- number_of_shards
- on_conflict
- recency
- source_enabled
- store_type
[ERROR] Cannot race. Error in race control (("Unused track parameters ['shard_count'].", None))

You can already use this version of Rally directly from the Rally sources in the GitHub repo, and of course this improvement will land in our next Rally release, which always gets announced here: https://discuss.elastic.co/search?q=rally%20category%3A5%20order%3Alatest.

Thanks,
Dimitris

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.