Distributed Requests from Rally

Hi -

I have slightly modified the percolator challenge to something like this:

...
"parallel": {
  "tasks": [
    {
      "operation": "percolator_with_content_president_bush",
      "clients": 100,
      "warmup-iterations": 100,
      "iterations": 100,
      "target-throughput": 10000
    }
  ]
...

At first I ran Rally through a load balancer with this command:

$ esrally --track-path=percolator/ --report-format=csv --report-file=reports/race.csv --target-hosts=internal-someelb.us-east-1.elb.amazonaws.com:9200 --pipeline=benchmark-only

There are about 12-48 data nodes behind the load balancer, but when monitoring them I noticed that only one, maybe two, were actually receiving requests from Rally. Is this normal? At first I thought my load balancer might be misconfigured, so I bypassed it by listing every data node explicitly in the --target-hosts flag. Surprisingly, the result was the same: only one or two data nodes received requests.

Is there a way to configure Rally to distribute requests evenly over all the target hosts, or over all the nodes behind a load balancer?

Rally uses long-running connections, so if you go through a load balancer, the distribution depends on how it routes connections at the moment Rally starts up and establishes them. If you provide a long list of IPs instead, I am not sure any coordination is done between the clients, so I would be less surprised if that led to an imbalance. I bet @danielmitterdorfer can clarify, though.
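
To make the point about long-running connections concrete, here is a toy model in Python. It is only a sketch under stated assumptions: each client opens exactly one connection at startup, the load balancer picks a backend per connection rather than per request, and the helper names (assign_connections, requests_per_node) and both policies are hypothetical. It does not reproduce Rally's actual client behavior.

```python
from collections import Counter
from itertools import cycle


def assign_connections(num_clients, nodes, policy="round_robin"):
    """Return {client_id: node}; the choice is made once, at connect time."""
    if policy == "round_robin":
        picker = cycle(nodes)
        return {client: next(picker) for client in range(num_clients)}
    if policy == "first_node":
        # Hypothetical worst case: every new connection lands on one node.
        return {client: nodes[0] for client in range(num_clients)}
    raise ValueError(f"unknown policy: {policy}")


def requests_per_node(assignment, requests_per_client):
    """Every request reuses the client's connection, so it follows the node."""
    counts = Counter()
    for node in assignment.values():
        counts[node] += requests_per_client
    return counts


# Assumed numbers: 100 clients, 12 nodes, 200 requests per client.
nodes = [f"node-{i}" for i in range(12)]
even = requests_per_node(assign_connections(100, nodes), 200)
skewed = requests_per_node(assign_connections(100, nodes, "first_node"), 200)
print(len(even), len(skewed))
```

In this model the spread across nodes is fixed entirely by where the connections land at startup; the number of requests sent afterwards does not change it.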

Thanks, Christian. I think my issue is the default track parameters. I didn't override them with the --track-params flag, so number_of_shards was defaulting to 5.
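
A rough way to see why the shard count matters: each shard copy lives on exactly one node, and a primary and its replica never share a node, so the number of data nodes that can hold part of the index is bounded by the number of shard copies. The sketch below works through that arithmetic. The function name is made up for illustration, and the replica count of 0 is an assumption, since the thread does not state it.

```python
def max_nodes_with_shards(data_nodes, number_of_shards, number_of_replicas):
    """Upper bound on data nodes that can hold a copy of the index's shards."""
    shard_copies = number_of_shards * (1 + number_of_replicas)
    return min(data_nodes, shard_copies)


# Cluster sizes from the question, default of 5 shards, assumed 0 replicas.
for data_nodes in (12, 48):
    bound = max_nodes_with_shards(data_nodes, 5, 0)
    print(f"{data_nodes} data nodes: at most {bound} hold shards")
```

Under these assumptions, no more than five data nodes can do shard-level work for the index, however the requests are routed. Overriding number_of_shards through --track-params raises that bound.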
