Adding nodes does not seem to speed up indexing

I ran load tests to find out how fast I could index documents. I am using the java client with ElasticSearch 0.7.0, and I found that adding more data nodes didn't really speed things up. Having only one data node was faster; I guess because it did not have to do any routing or balancing, but it is not sustainable for my needs. Having 2 and 4 data nodes gave me about the same transactions per second. Is this expected? Any advice on speeding up indexing throughput? Thanks.

Threads Itr/thread Data Nodes Trans per Sec
20 2000 1 1884.39
20 2000 2 946.10
20 2000 4 1061.04
40 4000 4 876.77

Are you maxing up on resources no your client side? Also, if you use the
default 5 shards each with 1 replica, with only one node, there will only be
5 shards allocated. With 2 nodes, 10 shards will be allocated, and index
operation will need to be replicated to another node.

Which client do you use? TransportClient or Node client?

cheers,
shay.banon

On Tue, May 25, 2010 at 10:35 PM, John Chang jchangkihtest2@gmail.comwrote:

I ran load tests to find out how fast I could index documents. I am using
the java client with Elasticsearch 0.7.0, and I found that adding more data
nodes didn't really speed things up. Having only one data node was faster;
I guess because it did not have to do any routing or balancing, but it is
not sustainable for my needs. Having 2 and 4 data nodes gave me about the
same transactions per second. Is this expected? Any advice on speeding
up
indexing throughput? Thanks.

Threads Itr/thread Data Nodes Trans per Sec
20 2000 1 1884.39
20 2000 2 946.10
20 2000 4 1061.04
40 4000 4 876.77

View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Adding-nodes-does-not-seem-to-speed-up-indexing-tp842987p842987.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

Also, how are you indexing the data? Is it a single thread indexing? In this
case, you will not see improvement.

On Wed, May 26, 2010 at 9:23 AM, Shay Banon shay.banon@elasticsearch.comwrote:

Are you maxing up on resources no your client side? Also, if you use the
default 5 shards each with 1 replica, with only one node, there will only be
5 shards allocated. With 2 nodes, 10 shards will be allocated, and index
operation will need to be replicated to another node.

Which client do you use? TransportClient or Node client?

cheers,
shay.banon

On Tue, May 25, 2010 at 10:35 PM, John Chang jchangkihtest2@gmail.comwrote:

I ran load tests to find out how fast I could index documents. I am using
the java client with Elasticsearch 0.7.0, and I found that adding more
data
nodes didn't really speed things up. Having only one data node was
faster;
I guess because it did not have to do any routing or balancing, but it is
not sustainable for my needs. Having 2 and 4 data nodes gave me about the
same transactions per second. Is this expected? Any advice on speeding
up
indexing throughput? Thanks.

Threads Itr/thread Data Nodes Trans per Sec
20 2000 1 1884.39
20 2000 2 946.10
20 2000 4 1061.04
40 4000 4 876.77

View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Adding-nodes-does-not-seem-to-speed-up-indexing-tp842987p842987.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

On Tue, 2010-05-25 at 12:35 -0700, John Chang wrote:

I ran load tests to find out how fast I could index documents. I am using
the java client with Elasticsearch 0.7.0, and I found that adding more data
nodes didn't really speed things up. Having only one data node was faster;
I guess because it did not have to do any routing or balancing, but it is
not sustainable for my needs. Having 2 and 4 data nodes gave me about the
same transactions per second. Is this expected? Any advice on speeding up
indexing throughput? Thanks.

I've seen the same thing, and kimchy has just explained to me on IRC how
things should work, so I thought I'd share it here for those who, like
me, are new to distributed workloads.

There are two concerns being addresses in a cluster:
 - performance/scalability

  • high availability/redundancy

There are three variables which we can change to affect the above:
 - index.number_of_shards (default 5)

  • index.number_of_replicas (default 2)
  • the number of nodes we start

index.number_of_shards is intended to improve performance - the more
shards (within reason) the more we can divide up the work between nodes.

index.number_of_replicas is intended to improve redundancy AND search
performance.

  • The more replicas, the better chance that our cluster will keep going
    when a node goes down. The number of replicas is in addition to the
    primary shard, so with the default of 2, we have the primary plus 2
    replicas (a total of 3)
  • The more replicas, the more that search requests can be distributed
    across more nodes

Having more replicas comes at a cost. A newly indexed document needs to
be added to the primary shard, then copied to each replica.

So for example:

  • we start 3 nodes
  • we create an index with 5 shards and 2 replicas.
  • this gives us a total of 15 shards (5 primaries and 10 replicas).
  • searching across this cluster is faster than searching against one
    node, as the data is spread across 3 nodes
  • indexing a new doc in this cluster will be slightly slower than
    indexing to a single node, because of the replication costs

Now we add a fourth node.

  • we still have 15 shards, but now they are spread across 4 machines,
  • searching speed improves because the data is spread across more
    machines
  • indexing speed improves because now each node is doing less work
    than a single node

So to see similar indexing performance improvements, you could decrease
the number of replicas to 1, but then you're potentially increasing the
risk of cluster downtime.

This is as I understand it - please correct me if I have something
wrong.

clint

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.

Basically its correct. A few notes:

  1. The default number of replicas is 1 and not 2 :).
  2. With more replicas, it means that each shard that exists within the
    replication group is basically doing the indexing. Note, replication is done
    using async IO, so its not a thread per shard target waiting for it to
    finish. Also, I plan to add async replication, which will improve
    performance by paying in being async.

On Wed, May 26, 2010 at 2:28 PM, Clinton Gormley clinton@iannounce.co.ukwrote:

On Tue, 2010-05-25 at 12:35 -0700, John Chang wrote:

I ran load tests to find out how fast I could index documents. I am
using
the java client with Elasticsearch 0.7.0, and I found that adding more
data
nodes didn't really speed things up. Having only one data node was
faster;
I guess because it did not have to do any routing or balancing, but it is
not sustainable for my needs. Having 2 and 4 data nodes gave me about
the
same transactions per second. Is this expected? Any advice on speeding
up
indexing throughput? Thanks.

I've seen the same thing, and kimchy has just explained to me on IRC how
things should work, so I thought I'd share it here for those who, like
me, are new to distributed workloads.

There are two concerns being addresses in a cluster:
 - performance/scalability

  • high availability/redundancy

There are three variables which we can change to affect the above:
 - index.number_of_shards (default 5)

  • index.number_of_replicas (default 2)
  • the number of nodes we start

index.number_of_shards is intended to improve performance - the more
shards (within reason) the more we can divide up the work between nodes.

index.number_of_replicas is intended to improve redundancy AND search
performance.

  • The more replicas, the better chance that our cluster will keep going
    when a node goes down. The number of replicas is in addition to the
    primary shard, so with the default of 2, we have the primary plus 2
    replicas (a total of 3)
  • The more replicas, the more that search requests can be distributed
    across more nodes

Having more replicas comes at a cost. A newly indexed document needs to
be added to the primary shard, then copied to each replica.

So for example:

  • we start 3 nodes
  • we create an index with 5 shards and 2 replicas.
  • this gives us a total of 15 shards (5 primaries and 10 replicas).
  • searching across this cluster is faster than searching against one
    node, as the data is spread across 3 nodes
  • indexing a new doc in this cluster will be slightly slower than
    indexing to a single node, because of the replication costs

Now we add a fourth node.

  • we still have 15 shards, but now they are spread across 4 machines,
  • searching speed improves because the data is spread across more
    machines
  • indexing speed improves because now each node is doing less work
    than a single node

So to see similar indexing performance improvements, you could decrease
the number of replicas to 1, but then you're potentially increasing the
risk of cluster downtime.

This is as I understand it - please correct me if I have something
wrong.

clint

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.

One more thing, you should do much more iterations to get proper performance
numbers. And discard the first 10,000 because of "warming" of the servers.

On Wed, May 26, 2010 at 2:39 PM, Shay Banon shay.banon@elasticsearch.comwrote:

Basically its correct. A few notes:

  1. The default number of replicas is 1 and not 2 :).
  2. With more replicas, it means that each shard that exists within the
    replication group is basically doing the indexing. Note, replication is done
    using async IO, so its not a thread per shard target waiting for it to
    finish. Also, I plan to add async replication, which will improve
    performance by paying in being async.

On Wed, May 26, 2010 at 2:28 PM, Clinton Gormley clinton@iannounce.co.ukwrote:

On Tue, 2010-05-25 at 12:35 -0700, John Chang wrote:

I ran load tests to find out how fast I could index documents. I am
using
the java client with Elasticsearch 0.7.0, and I found that adding more
data
nodes didn't really speed things up. Having only one data node was
faster;
I guess because it did not have to do any routing or balancing, but it
is
not sustainable for my needs. Having 2 and 4 data nodes gave me about
the
same transactions per second. Is this expected? Any advice on
speeding up
indexing throughput? Thanks.

I've seen the same thing, and kimchy has just explained to me on IRC how
things should work, so I thought I'd share it here for those who, like
me, are new to distributed workloads.

There are two concerns being addresses in a cluster:
 - performance/scalability

  • high availability/redundancy

There are three variables which we can change to affect the above:
 - index.number_of_shards (default 5)

  • index.number_of_replicas (default 2)
  • the number of nodes we start

index.number_of_shards is intended to improve performance - the more
shards (within reason) the more we can divide up the work between nodes.

index.number_of_replicas is intended to improve redundancy AND search
performance.

  • The more replicas, the better chance that our cluster will keep going
    when a node goes down. The number of replicas is in addition to the
    primary shard, so with the default of 2, we have the primary plus 2
    replicas (a total of 3)
  • The more replicas, the more that search requests can be distributed
    across more nodes

Having more replicas comes at a cost. A newly indexed document needs to
be added to the primary shard, then copied to each replica.

So for example:

  • we start 3 nodes
  • we create an index with 5 shards and 2 replicas.
  • this gives us a total of 15 shards (5 primaries and 10 replicas).
  • searching across this cluster is faster than searching against one
    node, as the data is spread across 3 nodes
  • indexing a new doc in this cluster will be slightly slower than
    indexing to a single node, because of the replication costs

Now we add a fourth node.

  • we still have 15 shards, but now they are spread across 4 machines,
  • searching speed improves because the data is spread across more
    machines
  • indexing speed improves because now each node is doing less work
    than a single node

So to see similar indexing performance improvements, you could decrease
the number of replicas to 1, but then you're potentially increasing the
risk of cluster downtime.

This is as I understand it - please correct me if I have something
wrong.

clint

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.

Thanks for the info. I'll try again, ignoring the first 10k. To answer your questions above, I am using the java-based node clientfrom the 0.7.1 release, and my client is multi-threaded.

Yea, don't know how I missed the multi threaded part in my second response,
too late I guess... . I would also check if your stress client is not maxing
up on resources (CPU, Mem). One more thing, is that on a single index, or
several indices?

On Wed, May 26, 2010 at 9:50 PM, John Chang jchangkihtest2@gmail.comwrote:

Thanks for the info. I'll try again, ignoring the first 10k. To answer
your
questions above, I am using the java-based node clientfrom the 0.7.1
release, and my client is multi-threaded.

View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Adding-nodes-does-not-seem-to-speed-up-indexing-tp842987p845702.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.