Perplexing benchmark result

Hi all,

I'm running some benchmark tests to measure search performance with various
numbers of shards. (I didn't take the indexing time into account.)

For now I'm just running Elasticsearch locally on one machine (localhost);
I built the index and ran ApacheBench against it.
The size of the index is about 100MB.

The query is as simple as searching text against some fields:

http://localhost:9200/media_sources/_search?_routing=1001560&pretty=1&q=account_id:1001560%20and%20field1:vampire%20OR%20name:vampire%20
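As a reference point, the same request can be assembled with the percent-encoding handled programmatically; a minimal Python sketch (field names taken from the URL above; uppercase AND/OR assumed, since Lucene's query syntax treats a lowercase "and" as an ordinary search term):

```python
from urllib.parse import urlencode

def build_search_url(account_id, term):
    # Query string in Lucene syntax; AND/OR must be uppercase to act
    # as boolean operators rather than plain terms.
    query = f"account_id:{account_id} AND field1:{term} OR name:{term}"
    params = urlencode({
        "_routing": account_id,
        "pretty": "1",
        "q": query,
    })
    return f"http://localhost:9200/media_sources/_search?{params}"

url = build_search_url(1001560, "vampire")
print(url)
```

`urlencode` takes care of escaping the colons and spaces, which is easy to get wrong by hand.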

Surprisingly, it appears that the best case for me is to run with 1 shard,
and more shards only degrade the search time. Is this because my test is
flawed, since it's only running on one machine?

Another question, about routing: it appears that with the same load settings
in ApacheBench (1 node, 20 concurrent threads, 400 requests total),
Elasticsearch responds with a lot more failed requests (~175/400) when
routing is used, while without routing the number of failed requests is only
about ~12/400. I was expecting routing to be more efficient, since it only
reads 1 shard per request; instead it looks like it put a lot of work on one
node. Was _routing designed to reduce communication between nodes by limiting
each request to a single shard? What's the explanation in this case?

Thank you.

Yuhan

--

Hey Yuhan,

please find my comments inline...

On Tuesday, October 16, 2012 4:27:19 AM UTC+2, Yuhan wrote:

Hi all,

I'm running some benchmark tests to measure search performance with various
numbers of shards. (I didn't take the indexing time into account.)

For now I'm just running Elasticsearch locally on one machine (localhost);
I built the index and ran ApacheBench against it.
The size of the index is about 100MB.

This is tiny, and likely all you see is noise! Take a reasonably large
dataset, > 1M documents; otherwise you won't see what you are trying to see :slight_smile:

The query is as simple as searching text against some fields:

http://localhost:9200/media_sources/_search?_routing=1001560&pretty=1&q=account_id:1001560%20and%20field1:vampire%20OR%20name:vampire%20

Actually, this query is not simple: it uses leading and trailing wildcards,
which makes it a very slow query, so you might not measure what you are trying
to measure. Most of the time will be spent rewriting the query internally,
provided you hit any data; with your index size this is still super fast, though.

Surprisingly, it appears that the best case for me is to run with 1 shard,
and more shards only degrade the search time. Is this because my test is
flawed, since it's only running on one machine?

This is not a surprise, actually. You have one machine and multiple shards;
that means you need to send requests to the shards, incur threading and maybe
network overhead, and merge the results, while with one shard you don't have
this overhead. You need to realize that sharding only makes sense if you have
enough data. If not, I'd just use replication and balance load across machines,
so you don't need to divide your request and merge the results.
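The two setups boil down to the index settings used at creation time; a hedged sketch of what each might look like (the values are illustrative, not recommendations):

```python
import json

# One primary shard; replicas scale read throughput across machines,
# which suits a small index like this one.
single_shard = {"settings": {"number_of_shards": 1, "number_of_replicas": 2}}

# Multiple primaries only pay off once there is enough data per shard
# to justify the scatter/gather and merge overhead.
many_shards = {"settings": {"number_of_shards": 5, "number_of_replicas": 0}}

# Either body would be sent when creating the index, e.g.
#   PUT http://localhost:9200/media_sources
print(json.dumps(single_shard))
```

Note that `number_of_shards` is fixed at index creation, while `number_of_replicas` can be changed later, which is another reason to start small and add replicas as load grows.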

Another question, about routing: it appears that with the same load settings
in ApacheBench (1 node, 20 concurrent threads, 400 requests total),
Elasticsearch responds with a lot more failed requests (~175/400) when
routing is used, while without routing the number of failed requests is only
about ~12/400. I was expecting routing to be more efficient, since it only
reads 1 shard per request; instead it looks like it put a lot of work on one
node. Was _routing designed to reduce communication between nodes by limiting
each request to a single shard? What's the explanation in this case?

Can you paste / gist the failures you see?
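For background, routing picks a shard as hash(routing) modulo the number of primary shards, so every request carrying the same routing value lands on the same shard; with ApacheBench sending all 400 requests with `_routing=1001560`, the whole load concentrates on one shard's search threads, which could plausibly explain more rejected requests than the unrouted case, where load spreads across shards. An illustrative sketch (crc32 stands in for Elasticsearch's actual hash, which this is not):

```python
import zlib

def shard_for(routing: str, num_shards: int) -> int:
    # Illustrative only: Elasticsearch uses a different hash function,
    # but the modulo-based shard selection is the same idea.
    return zlib.crc32(routing.encode("utf-8")) % num_shards

# Every request with the same routing value maps to the same shard...
shards = {shard_for("1001560", 5) for _ in range(400)}
print(shards)  # a single shard id

# ...while varied routing values spread across shards.
spread = {shard_for(str(account), 5) for account in range(1000)}
print(sorted(spread))
```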

simon

Thank you.

Yuhan

--

Hi Simon,

Thanks for the analysis. Looks like my data is small enough to go with 1
shard. :slight_smile:

The test was conducted with ApacheBench, so I only got back a failure
count and document size. I will do more testing and find out the actual
output.

Thank you.

Yuhan

Document Path:
/media_sources/_search?pretty=1&q=account_id:1001560%20and%20long_description:vampire%20OR%20name:vampire%20OR%20short_description:vampire%20OR%20tags:vampire
Document Length: 6081 bytes

Concurrency Level: 20
Time taken for tests: 19.700 seconds
Complete requests: 400
Failed requests: 143
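Since ab only reports a count, one way to see the actual failures is to replay a few requests and keep any non-200 bodies; a minimal sketch (the URL is shortened from the one above). Note that ab also counts any response whose body length differs from the first response as "failed", so with pretty-printed JSON whose `took` field varies, some of those 143 "failures" may actually be successful 200s:

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def fetch(url: str, timeout: float = 5.0):
    """Return (status, body); error bodies often contain the actual
    Elasticsearch failure (e.g. shard failures, rejected executions)."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except HTTPError as err:   # non-2xx response with a body
        return err.code, err.read().decode("utf-8", "replace")
    except URLError as err:    # connection-level failure
        return None, str(err.reason)

def is_failure(status) -> bool:
    # No status means the connection itself failed.
    return status is None or status >= 400

if __name__ == "__main__":
    url = "http://localhost:9200/media_sources/_search?q=name:vampire"
    status, body = fetch(url)
    if is_failure(status):
        print(status, body[:500])
```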

On Tue, Oct 16, 2012 at 1:00 AM, simonw
<simon.willnauer@elasticsearch.com> wrote:

