Acceptable Search Performance

Devashish_Tyagi · July 2, 2013, 2:57pm

I am using Elasticsearch to index and search around 6 million HTML
documents. To do that I created an index with 10 shards. Initially
I was running just one node on my physical machine (details below), with
a 9 GB heap allocated to Elasticsearch. I am performing only simple
search queries (nothing fancy). You can see a typical search query here:
https://gist.github.com/devashishtyagi/5909747
The average response size from Elasticsearch is ~80 KB. To test
the performance of Elasticsearch, I created an Apache JMeter test. The test
reads random words from a search term file I supplied and fetches
the response from Elasticsearch. The JMeter tests were run from a
separate machine located close by (so not much network overhead).
These are the results I got:

10 threads, 50 requests per thread - 2.3 QPS and average response
time of > 4 sec
5 threads, 100 requests per thread - 3.4 QPS and average response
time of > 1 sec
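
As a rough sanity check on these figures, Little's law for a closed-loop load test relates concurrency, throughput, and latency: mean response time is approximately threads / QPS. The sketch below assumes the JMeter threads send requests back to back with no think time, which the post does not state; the function name is only illustrative.

```python
def implied_latency_s(threads: int, qps: float) -> float:
    """Mean response time implied by Little's law for a closed-loop test.

    Assumes every thread issues its next request as soon as the previous
    one returns (no think time), which the original post does not confirm.
    """
    return threads / qps


# (threads, measured QPS) pairs reported for the single-node setup
runs = [(10, 2.3), (5, 3.4)]

for threads, qps in runs:
    latency = implied_latency_s(threads, qps)
    print(f"{threads} threads at {qps} QPS -> ~{latency:.2f} s per request")
```

The implied values (roughly 4.3 s and 1.5 s) are in line with the reported "> 4 sec" and "> 1 sec" averages, so the throughput and latency figures appear consistent with each other.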

Here are some of my index statistics

Number of shards - 10
Number of documents - 5174688
Size of index - 56 GB
Size of a typical shard - 5.5 GB
Number of replicas - 0

My machine configuration

Amazon EC2 m3.xlarge

RAM - 15 GB
Compute Units - 13
Hard Drive - 1 TB EBS Drive
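
A back-of-the-envelope calculation from the figures above gives a feel for how much of the index could be served from memory. Lucene index files are read through the operating system's file cache, so only the RAM left over after the JVM heap is available for that. The sketch below is an upper bound under simplifying assumptions (it ignores memory used by the OS itself and JVM overhead beyond the heap); the function name is hypothetical.

```python
def page_cache_fraction(ram_gb: float, heap_gb: float, index_gb: float) -> float:
    """Upper bound on the share of the index the OS page cache can hold.

    Ignores memory used by the OS and by JVM overhead beyond the heap,
    so the real figure would be lower.
    """
    free_for_cache_gb = max(ram_gb - heap_gb, 0.0)
    return min(free_for_cache_gb / index_gb, 1.0)


# Figures from the post: 15 GB RAM, 9 GB Elasticsearch heap, 56 GB index
fraction = page_cache_fraction(ram_gb=15, heap_gb=9, index_gb=56)
print(f"At most {fraction:.0%} of the index fits in the page cache")
```

With about 6 GB left for caching a 56 GB index, most reads for randomly chosen search terms would likely have to go to disk.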

I went through several search-performance threads on the group, and it
feels like I am getting subpar performance. Or is this acceptable
search performance?

During my tests I found that Elasticsearch was bottlenecked on
disk I/O. So I added 3 more EBS drives to the same machine and started 3
new Elasticsearch nodes on it, giving 4 Elasticsearch nodes
running on the same server. Here are the performance test results with this
configuration:

10 threads, 50 requests per thread - 11.6 QPS and average response
time of ~ 842 ms.
7 threads, 100 requests per thread - 13.4 QPS and average response
time of ~ 512 ms.
8 threads, 100 requests per thread - 14.3 QPS and average response
time of ~ 550 ms.

This seems like a huge improvement, but even with 4 drives
Elasticsearch is still bottlenecked on disk I/O. Is this expected?

P.S. I have come across various posts saying that routing
greatly improves performance, but I have no idea how to use it in my use
case.
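
For reference, routing is passed as a `routing` parameter on both index and search requests, so that documents sharing a key land on one shard and a search for that key only visits that shard. The sketch below only builds the request URLs. It assumes the documents have a natural partition key such as the site they came from, which the post does not say; the host, index name, type name, and key are all hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed local node
INDEX = "pages"                     # hypothetical index name
DOC_TYPE = "page"                   # hypothetical mapping type


def index_url(doc_id: str, routing_key: str) -> str:
    """URL for indexing a document onto the shard chosen by routing_key."""
    query = urlencode({"routing": routing_key})
    return f"{BASE_URL}/{INDEX}/{DOC_TYPE}/{doc_id}?{query}"


def search_url(routing_key: str) -> str:
    """URL for a search that only visits the shard(s) for routing_key."""
    query = urlencode({"routing": routing_key})
    return f"{BASE_URL}/{INDEX}/_search?{query}"


print(index_url("42", "example.com"))
print(search_url("example.com"))
```

Routing only helps when a search can be restricted to one key. A search across the whole corpus still has to visit every shard.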

Thanks in advance,
Devashish Tyagi

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · July 2, 2013, 3:31pm

Quick analysis:

You have highlighting on. This feature tends to work the disk heavily
(doc fetching).

You have size = 100. This is an extreme setting, and it creates an
additional burden for highlighting.

To optimize highlighting, there are options that you do not use yet
(hint: term vector, fast vector highlighter):
http://www.elasticsearch.org/guide/reference/api/search/highlighting/

Do not run more than one node per machine; there is not much sense
in it.

EBS drives are known to be slow (they go over 1 Gbit network channels).
ES is built to scale across machines, not just across drives, so use
more machines.

And, finally, use "query" instead of "filtered query" in the query
unless you know what you want to test (you simply thrash your very large
filter cache when load testing, which is bad for overall performance).

Jörg
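
The highlighting, size, and query suggestions above could look roughly like the following request bodies. This is a sketch under assumptions, not the poster's actual setup: the mapping and query from the gist are not reproduced here, so the field name `content` and the type name `page` are hypothetical, and the `string` field type reflects Elasticsearch versions of that era. Storing term vectors with positions and offsets is the setting that makes the fast vector highlighter available for a field.

```python
import json

FIELD = "content"  # hypothetical field name; the original mapping is not shown

# Mapping fragment: storing term vectors with positions and offsets is what
# allows the fast vector highlighter to be used for this field.
mapping = {
    "page": {  # hypothetical mapping type
        "properties": {
            FIELD: {
                "type": "string",
                "term_vector": "with_positions_offsets",
            }
        }
    }
}


def build_search_body(term: str, size: int = 10) -> dict:
    """Search body using a plain query (no filtered query) and a modest size."""
    return {
        "size": size,
        "query": {"match": {FIELD: term}},
        "highlight": {"fields": {FIELD: {}}},
    }


print(json.dumps(mapping, indent=2))
print(json.dumps(build_search_body("performance"), indent=2))
```

Changing `term_vector` on an existing field requires reindexing the documents, so this would need to be tried on a fresh index.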

