Number of replicas and query speed

bogdanionescu · January 9, 2012, 10:08am

Hello

I have a 4-node cluster with 7.7M docs indexed, in 5 shards.
First I tried 1 replica (the default config) and we tested the
query speed. It worked fine.
Then we used the REST API to increase the number of replicas to 4, and
after a while the nodes reflected this change (the shard directories
stored locally on each node held a copy of all 5 shards).
The problem is that the query speed was exactly the same as with the
1-replica config!
Any ideas why this happens? Shouldn't there be some improvement?
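
For reference, a replica change like the one described above goes through the index update-settings endpoint. The following is a minimal Python sketch, assuming a node reachable at http://localhost:9200 and an index called "myindex" (both are hypothetical; the post does not name them). It only builds the request, so it can be inspected without a running cluster.

```python
import json
import urllib.request


def build_replica_request(base_url, index, replicas):
    """Build (but do not send) a PUT request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name, not taken from the thread.
req = build_replica_request("http://localhost:9200", "myindex", 4)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a live cluster: urllib.request.urlopen(req)
```

The number of replicas can be changed on a live index this way, whereas the number of primary shards is fixed when the index is created.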

Clinton_Gormley · January 9, 2012, 10:25am

Hiya

ES still has to talk to exactly the same number of shards for your
query. The fact that there are more of them to choose from doesn't
affect your query speed.

Where it will make a difference is when you reach the point that your
1-replica setup is too busy to cope with all of your queries. At that
stage, having more replicas to choose from will help you to scale.
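
The point about per-query work can be put as simple arithmetic. This is only an illustrative sketch with hypothetical function names, not Elasticsearch code: a search runs on one copy (primary or replica) of each shard, so the replica count changes how many copies exist but not how many a single query touches.

```python
def copies_searched_per_query(shards, replicas):
    # One copy of every shard is searched; the replica count is
    # deliberately ignored because it does not change this number.
    return shards


def total_copies(shards, replicas):
    # Each shard has one primary plus `replicas` replica copies.
    return shards * (1 + replicas)


for replicas in (1, 4):
    print(
        f"replicas={replicas}: "
        f"{copies_searched_per_query(5, replicas)} copies searched per query, "
        f"{total_copies(5, replicas)} copies configured"
    )
```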

clint

bogdanionescu · January 9, 2012, 10:55am

I see... I agree the response time should be the same, but I
thought the throughput should increase.
I'll try to increase the load and compare the results then.

bogdanionescu · January 9, 2012, 3:50pm

I did some tests under heavier load and I still could not see any
improvement when using more replicas...

Berkay_Mollamustafao · January 9, 2012, 4:01pm

Can you give the details of the tests you're running? What are the
variables in your tests? What resource (CPU, disk I/O, etc.) seems to be the
bottleneck? How many client connections do you use? Do you increase the
number? Do the clients run on the same box as the servers? Which language is
the client written in, and could it be the bottleneck? What are the queries? Do
you change the queries or run the same ones? Are you only querying, or are you
updating the indices at the same time as well?
It's not feasible to guess what the reason may be without fully
understanding the details of the tests.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype


kimchy · January 9, 2012, 7:49pm

On a 4-node setup with 5 shards and 1 replica, increasing the number of
replicas will not change the search performance, since all the nodes are
already "maxed out" in terms of the searches being executed on them. You are
just "making" more shard copies, but you still have only 4 boxes.

If you have an index that does not yet use every box, let's say an index with 2
shards and 1 replica on a 10-box cluster (there might be other indices),
then increasing the number of replicas will help, since the index will span
more boxes in this case.
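
The two cases above can be compared with a small calculation. This is a rough sketch under stated assumptions, not the actual allocation logic: it assumes two copies of the same shard are never placed on the same node (surplus copies stay unassigned) and that copies are otherwise spread as widely as possible. The function name is hypothetical.

```python
def nodes_spanned(nodes, shards, replicas):
    """Upper bound on how many nodes can hold a copy of the index."""
    # A shard cannot have more assigned copies than there are nodes.
    copies_per_shard = min(1 + replicas, nodes)
    # The index cannot occupy more nodes than it has assigned copies.
    return min(nodes, shards * copies_per_shard)


print(nodes_spanned(4, 5, 1))   # 4 boxes, 5 shards, 1 replica
print(nodes_spanned(4, 5, 4))   # same cluster, 4 replicas
print(nodes_spanned(10, 2, 1))  # 10 boxes, 2 shards, 1 replica
print(nodes_spanned(10, 2, 4))  # same cluster, 4 replicas
```

Under these assumptions the 4-box cluster is fully occupied at 1 replica already, so extra replicas add no search capacity, while the 2-shard index on 10 boxes goes from 4 boxes to 10.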
