Disappointing insert throughput (non-bulk)


To get acquainted with Elasticsearch and test its write throughput, I've set up a simple PHP script that inserts a JSON document with 3 fields, like so:

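private function registerName($name) {
	// Assumption: $ip was undefined in this snippet; the client address is used here.
	$ip = $_SERVER['REMOTE_ADDR'] ?? null;

	$data = [
		"name" => $name,
		"ip" => $ip,
		"registerTime" => time()
	];

	$db = $this->dbConnect();

	// Index a single document (one request per registration)
	$params = [
		'index' => 'users',
		'type' => 'json',
		'body' => $data
	];
	$response = $db->index($params);

	// Return the ID Elasticsearch generated for the document
	return $response["_id"];
}

function dbConnect() {
	// Builds a new client on every call
	return Elasticsearch\ClientBuilder::create()->setHosts([""])->build();
}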

Unfortunately, I'm only getting a throughput of 5.5k docs/s through this script. I'm running the script from a different server than the Elasticsearch server, because when I had them on the same machine I was getting only 3.3k docs/s (which makes sense). The script is run with a concurrency of 800.

The two machines (Elasticsearch & PHP client) are both c5.2xlarge instances on Amazon AWS. This is a standard Elasticsearch install with no settings altered other than the IP address it binds to. I upped the disk of the ES instance to 1 TB, which gives me 3000 IOPS. I benchmarked it at 148 MB/s (megabytes) using 'dd'. While firing the PHP script I can see the disk I/O is around 10 MB/s, sometimes touching 30 MB/s and then quickly dropping back down. The PHP client doesn't appear to be the bottleneck, considering that adding another one doesn't increase my throughput.

I was expecting roughly 65k docs/s, so this was rather disappointing. Interestingly, benchmarking the instance using Rally does give me 66k docs/s on the 'index-append' test, but I'm not sure if that test is comparable to my use-case. I can also see the instance disk is running at ~100 MB/s during this test, so a lot more than the ~10 MB/s I'm getting with my own test. I'm guessing the Rally test uses one connection (or a minimal number of connections) and bulks requests as much as possible.

Can someone tell me if ES fits my use-case (many individual clients/connections, short-lived connections, 1 insert each) and if so, what I need to do to reach the desired throughput? I was hoping to reach ~200k docs/s after sharding across 3 instances.

Perhaps interesting, here's a HTOP snapshot during the test with the PHP client:

And here's the output from wrk (the tool used to call the PHP script over HTTP):

Running 1m test @
  4 threads and 800 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   166.97ms  153.43ms   1.93s    86.04%
    Req/Sec     1.41k   187.60     2.40k    72.17%
  336983 requests in 1.00m, 84.46MB read
  Socket errors: connect 0, read 0, write 0, timeout 55
Requests/sec:   5610.74
Transfer/sec:      1.41MB
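
As a rough sanity check on these numbers, Little's law (concurrency = throughput × average latency) ties the 800 connections to the reported request rate. The sketch below only uses figures from the wrk output; the function names are made up for illustration, and the 65k docs/s target is the expectation stated earlier, not a measured value.

```php
<?php
// Little's law: concurrency = throughput * average latency.
// Hypothetical helper names; inputs are taken from the wrk output.

// Average latency (ms) implied by a given concurrency and request rate.
function impliedLatencyMs(float $connections, float $requestsPerSec): float
{
    return $connections / $requestsPerSec * 1000.0;
}

// Number of in-flight requests needed to sustain a target rate at a given latency.
function requiredConcurrency(float $targetPerSec, float $latencyMs): float
{
    return $targetPerSec * $latencyMs / 1000.0;
}

$latencyMs = impliedLatencyMs(800, 5610.74);
printf("implied average latency: %.1f ms\n", $latencyMs);
printf("in-flight requests needed for 65k/s: %.0f\n", requiredConcurrency(65000, $latencyMs));
```

The implied average latency comes out at roughly 143 ms, the same order as the 166.97 ms wrk reports. At that per-request latency, single-document indexing would need several thousand requests in flight to approach 65k docs/s, so the per-request round trip matters more here than disk bandwidth.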

Please note that I realise this is a sub-optimal approach. I'm using it because it's fairly realistic for our use-case (it mimics 'real visitors').

Full Rally results: https://pastebin.com/xpJjmkuT

(Christian Dahlqvist) #2

Bulk indexing will, as described in the documentation, give much better throughput than indexing individual documents, so what you are seeing is expected. Rally uses bulk requests, which explains the difference in performance. Why are you not using bulk requests when indexing?
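
For comparison with the single-document script above, here is a minimal sketch of how a bulk request body could be assembled for the PHP client. The helper name `buildBulkParams` and the sample documents are hypothetical; the index and type names are taken from the original script, and the `_type` entry is optional because newer Elasticsearch versions no longer use mapping types.

```php
<?php
// Sketch: build the params for one bulk request from a list of documents.
// The bulk body alternates between an action/metadata entry and the document source.
function buildBulkParams(string $index, array $docs, ?string $type = null): array
{
    $meta = ['_index' => $index];
    if ($type !== null) {
        // Only needed on older Elasticsearch versions that still use mapping types.
        $meta['_type'] = $type;
    }

    $params = ['body' => []];
    foreach ($docs as $doc) {
        $params['body'][] = ['index' => $meta]; // action line (ID auto-generated)
        $params['body'][] = $doc;               // source line
    }
    return $params;
}

// Made-up sample documents with the same three fields as the original script.
$docs = [
    ['name' => 'alice', 'ip' => '203.0.113.1', 'registerTime' => 1500000000],
    ['name' => 'bob',   'ip' => '203.0.113.2', 'registerTime' => 1500000001],
];
$params = buildBulkParams('users', $docs, 'json');

// With a client built as in dbConnect(), the request would be sent as:
//   $response = $client->bulk($params);
// The response's 'items' array then holds one result per document, including its _id.
```

Each document still gets its own generated ID back in the bulk response, but all documents in the batch share one HTTP round trip, which is where the throughput difference comes from.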


Thanks for your answer. The reason is our use case: user registration, using PHP as the server-side language. 1 user = 1 PHP process = 1 connection. And the insert may not be delayed, since the user is waiting for their user ID so they can continue with our service. There's no possibility for bulking there. I guess that means ES is not a good fit for this use-case. Which is fine, of course :wink:

(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.