Esrally client option

tmdgk490255 · April 10, 2023, 5:20am

There are options to specify the number of clients for a given operation in a Rally track:

"schedule": [
    {
      "operation": "force-merge",
      "clients": 1 // here
    },
    {
      "operation": "match-all-query",
      "clients": 4,
      "warmup-iterations": 1000,
      "iterations": 1000,
      "target-throughput": 100
    }
]

So when I configure 4 clients in the track file, does each client have a different IP address, or do they all share one IP address, namely that of the server where esrally is installed?

Also, is there any source code that assigns the clients' addresses (not the ES client)? I looked through the esrally GitHub repository but couldn't find it.

dliappis · April 11, 2023, 3:19pm

So when I configure 4 clients in the track file, does each client have a different IP address, or do they all share one IP address, namely that of the server where esrally is installed?

No, it will use the same IP address under the hood.

Also, is there any source code that assigns the clients' addresses (not the ES client)? I looked through the esrally GitHub repository but couldn't find it.

Rally has a very thin abstraction on top of the Elasticsearch Python client, which you can find here.

It doesn't have specific logic about which interface it binds to per connection; it relies on the ES Python client. I believe that the interface used will be whatever the OS decides by default, e.g. on Linux what ip route get <target ip> shows.
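To see which local address the OS would choose for a given target, something like the following sketch can help. It is not part of Rally or the Elasticsearch client; the function name `source_ip_for` and the default port 9200 are assumptions made for illustration. It relies on the fact that calling `connect()` on a UDP socket sends no packets and only makes the kernel pick a route.

```python
import socket


def source_ip_for(target_ip: str, port: int = 9200) -> str:
    """Return the local (source) address the OS would use to reach target_ip."""
    # connect() on a UDP socket transmits nothing; it only asks the kernel
    # to select a route, and with it a local address, for this destination.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((target_ip, port))
        return s.getsockname()[0]


if __name__ == "__main__":
    # Replace the loopback address with the IP of one of your ES nodes.
    print(source_ip_for("127.0.0.1"))
```

The result should correspond to the `src` field that ip route get prints for the same target on Linux.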

tmdgk490255 · April 12, 2023, 5:05am

So you mean when I configure the track file like this

"schedule": [
    {
      "operation": "match-all-query",
      "clients": 4, // four clients
      "warmup-iterations": 1000,
      "iterations": 1000,
      "target-throughput": 100
    }
]

The clients issuing the "match-all-query" operation are created by this source code, and they will all have the same IP address, right?

If all clients use the same IP address, is there any point in testing ES with Rally? I thought Elasticsearch master nodes distribute requests depending on the client's IP address. I believe the test is meaningful only when ES handles multiple clients with different, unique IPs, so that ES can assign the operations to as many nodes as possible. Please correct me if I'm wrong.

dliappis · April 12, 2023, 6:00am

Oh I see the confusion, apologies. You are asking about the target IP address (i.e. Elasticsearch's, aka the foreign address), whereas I thought you were asking about the local address that Rally binds to.

When you've provided a list of IP addresses via Rally's --target-hosts CLI option, then again Rally doesn't do anything special but relies on the Elasticsearch Python client's logic which, by default, is to connect in a round-robin fashion. If you are using Elastic Cloud (if not, give it a try!) there will be a single endpoint that has the smarts to do the load balancing and routing.
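Round-robin selection over the target hosts can be pictured with the minimal sketch below. This is an illustration of the idea only, not the Elasticsearch Python client's actual code; the function `round_robin` and the host addresses are hypothetical.

```python
from itertools import cycle


def round_robin(hosts, n_requests):
    """Return the host that each of n_requests would be sent to, in order."""
    picker = cycle(hosts)  # endlessly repeats the host list in order
    return [next(picker) for _ in range(n_requests)]


if __name__ == "__main__":
    # Hypothetical node addresses, as they might be passed to --target-hosts.
    hosts = ["10.0.0.1:9200", "10.0.0.2:9200", "10.0.0.3:9200"]
    for i, host in enumerate(round_robin(hosts, 5), start=1):
        print(f"request {i} -> {host}")
```

Note that the choice of target node here depends only on the order of requests, never on the source address of the caller.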

tmdgk490255 · April 12, 2023, 7:50am

I'm not talking about the client-server relationship between Rally and Elasticsearch. I understand the --target-hosts command option specifies which ES nodes esrally connects to, and that isn't what I'm asking about.

When I use the word 'client', I mean the application that sends search or indexing operations to the ES cluster. In the ELK stack it would be Logstash, and in EFK it would be Fluentd. If there are three Logstash servers, each on its own EC2 instance, then the ES nodes receive requests from three different IP addresses.

In Rally's case, I thought we don't need to create such a client server (as I said, Fluentd or Logstash) to send requests to ES; we just need to specify the 'clients' option in the track file.

  "schedule": [
    {
      "operation": "force-merge",
      "clients": 1
    },
    {
      "operation": "match-all-query",
      "clients": 8, // here
      "warmup-iterations": 1000,
      "iterations": 1000,
      "target-throughput": 1000 // also here
    }
  ]

if you specify 'target-throughput: 1000' with 8 clients, it means that each client will issue 125 (= 1000 / 8) requests per second. In total, all clients will issue 1000 requests each second.

I quoted this paragraph from the Rally docs. And finally, here is the question: when I specify 8 clients for the search operation, does the ES cluster receive search requests from 8 different IP addresses, or from just one IP address (perhaps that of the esrally server)?
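As a side note on the quoted docs paragraph, the arithmetic can be sketched as follows. The helper names are made up for illustration and are not part of Rally; the sketch simply assumes the target throughput is split evenly across clients, as the quote describes.

```python
def per_client_throughput(target_throughput: float, clients: int) -> float:
    """Requests per second each client issues when the target is split evenly."""
    return target_throughput / clients


def per_client_interval(target_throughput: float, clients: int) -> float:
    """Seconds between two consecutive requests of a single client."""
    return clients / target_throughput


if __name__ == "__main__":
    print(per_client_throughput(1000, 8))  # requests/s per client
    print(per_client_interval(1000, 8))    # seconds between requests per client
```

With "target-throughput": 1000 and "clients": 8 this gives 125 requests per second per client, i.e. one request every 8 ms from each client.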

dliappis · April 12, 2023, 8:59am

If you are running just one Rally instance, on one server where there is only one network interface that can connect to the various Elasticsearch nodes, then the source address will naturally be the same regardless of the number of clients. I mean, there are no other IP addresses available to use, so how would the established connection have any other source IP?
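This can be observed with a small self-contained experiment on the loopback interface. The sketch below does not involve Rally or Elasticsearch; `peer_addresses` is a hypothetical helper that opens several TCP connections from one machine to a local listener and reports what the server side sees.

```python
import socket


def peer_addresses(n_clients: int):
    """Open n_clients TCP connections to a local listener and return the
    (ip, port) pairs seen on the server side."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(n_clients)       # backlog large enough to hold all clients
    port = server.getsockname()[1]

    # The kernel completes each handshake in the backlog, so the connects
    # return before accept() is called.
    clients = [socket.create_connection(("127.0.0.1", port))
               for _ in range(n_clients)]

    peers = []
    for _ in range(n_clients):
        conn, addr = server.accept()
        peers.append(addr)
        conn.close()

    for c in clients:
        c.close()
    server.close()
    return peers


if __name__ == "__main__":
    for ip, src_port in peer_addresses(8):
        print(ip, src_port)
```

All eight connections show the same source IP and differ only in their source port, which is how the server tells them apart.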

Now, if you have a custom configuration on your server with several network interfaces (say eth0, eth1 etc.) configured such that e.g. eth0 -> es0, eth1 -> es1 etc. then, yes, different connections will use different IP addresses, depending on the node that the elasticsearch python client decided to connect to.

Finally, if you use Rally in distributed mode, where you run several Rally instances on different machines, it's like the previous case: if you specify more than one client, the connections will have different source IP addresses, since each Rally process runs on a different machine with its own network interface and its own dedicated IP.

As I said before, the routing decision is done at the operating system level similarly to what ip route get <some es node ip address> would show you.

dliappis · April 12, 2023, 9:32am

Regarding the idea that Elasticsearch master nodes distribute requests depending on the client's IP address: this is wrong. All nodes route requests to other nodes, but routing doesn't depend on the source IP.

See also "Query Phase" in Elasticsearch: The Definitive Guide for a deeper explanation:

When a search request is sent to a node, that node becomes the coordinating node. It is the job of this node to broadcast the search request to all involved shards, and to gather their responses into a globally sorted result set that it can return to the client.
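The scatter-gather role described in that quote can be pictured with the toy sketch below. It is not Elasticsearch code; `coordinate` and the sample hits are invented for illustration, and it assumes each shard returns its hits already sorted by descending score.

```python
import heapq
from itertools import islice


def coordinate(shard_results, size):
    """Merge per-shard hit lists, each sorted by descending score, into one
    globally sorted list and keep the top `size` hits."""
    # heapq.merge expects ascending input, so sort on the negated score.
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(islice(merged, size))


if __name__ == "__main__":
    # Hypothetical (score, doc_id) hits from two shards.
    shard_0 = [(0.9, "a"), (0.2, "b")]
    shard_1 = [(0.7, "c"), (0.5, "d")]
    print(coordinate([shard_0, shard_1], size=3))
```

Nothing in this step looks at where the request came from; whichever node receives the request does the merging.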

Ideally, if your target environment has dedicated master nodes, you'll exclude those from --target-hosts so that they can focus on their master responsibilities.

system · May 10, 2023, 9:33am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
