I am using elasticsearch for indexing and searching around 6 million HTML
documents. To do that I created an index with 10 shards. Initially,
I was running just one node on my physical machine (details below). I
allocated a 9 GB heap to elasticsearch. I am performing just simple
search queries (nothing fancy). You can see a typical search query here:
https://gist.github.com/devashishtyagi/5909747
The average response size from elasticsearch is ~80 KB. In order to test
the performance of elasticsearch, I created an Apache JMeter test. The test
would read random words from a search term file I supplied and fetch back
the response from elasticsearch. The JMeter tests were run from a
separate machine located close by (so not much network overhead).
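A rough Python equivalent of this kind of test might look like the sketch below. It is not the actual JMeter plan. The host, the index name `myindex`, the field name `content`, and the `match` query body are placeholders chosen for illustration; the real query is the one in the gist linked above.

```python
import json
import random
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoint: host, port and index name are placeholders.
SEARCH_URL = "http://localhost:9200/myindex/_search"


def http_search(term):
    """Send one search request and read the whole response body."""
    # Placeholder query on an assumed field name; not the query from the gist.
    body = json.dumps({"query": {"match": {"content": term}}}).encode("utf-8")
    req = urllib.request.Request(
        SEARCH_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def run_load_test(search, terms, threads, requests_per_thread, seed=0):
    """Run `threads` workers, each issuing `requests_per_thread` searches
    back to back with random terms, and report throughput and latency."""
    rng = random.Random(seed)
    # Pick the terms up front so the workers do not share the RNG.
    plan = [
        [rng.choice(terms) for _ in range(requests_per_thread)]
        for _ in range(threads)
    ]

    def worker(words):
        latencies = []
        for word in words:
            t0 = time.perf_counter()
            search(word)
            latencies.append(time.perf_counter() - t0)
        return latencies

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        per_thread = list(pool.map(worker, plan))
    elapsed = time.perf_counter() - start

    all_latencies = [x for lat in per_thread for x in lat]
    return {
        "requests": len(all_latencies),
        "qps": len(all_latencies) / elapsed,
        "avg_latency_s": sum(all_latencies) / len(all_latencies),
    }
```

With a list of words loaded from the search term file, a call such as `run_load_test(http_search, terms, threads=10, requests_per_thread=50)` would correspond to the first run reported below. Passing the search function in as a parameter keeps the timing logic separate from the HTTP call, so the harness can be tried against a stub first.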
These are the results I got:
- 10 threads, 50 requests per thread - 2.3 QPS and average response
  time of > 4 sec
- 5 threads, 100 requests per thread - 3.4 QPS and average response
  time of > 1 sec
Here are some of my index statistics
- Number of shards - 10
- Number of documents - 5174688
- Size of index - 56 GB
- Size of a typical shard - 5.5 GB
- Number of replicas - 0
My machine configuration
Amazon EC2 m3.xlarge
- RAM - 15 GB
- Compute Units - 13
- Hard Drive - 1 TB EBS Drive
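As a back-of-the-envelope check on these figures, the sketch below estimates how much of the index could sit in the operating system's page cache in the single-node setup. It assumes that all RAM not given to the 9 GB heap is available for caching, which is optimistic because the OS and other processes also need memory.

```python
# Figures taken from the post (single-node setup).
ram_gb = 15.0      # instance RAM
heap_gb = 9.0      # heap allocated to elasticsearch
index_gb = 56.0    # size of the index on disk
shards = 10
docs = 5174688

# Upper bound on memory left for the OS page cache (assumption: nothing
# else on the machine uses a significant amount of RAM).
page_cache_gb = ram_gb - heap_gb
cache_fraction = page_cache_gb / index_gb

avg_shard_gb = index_gb / shards
avg_doc_kb = index_gb * 1024 * 1024 / docs

print(f"page cache upper bound: {page_cache_gb:.1f} GB")
print(f"fraction of index that fits in cache: {cache_fraction:.1%}")
print(f"average shard size: {avg_shard_gb:.1f} GB")
print(f"average on-disk size per document: {avg_doc_kb:.1f} KB")
```

Under that assumption at most about 6 GB, or roughly 11% of the 56 GB index, can be cached in memory, so most searches with random terms would have to read from the EBS volume.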
I went through several search-performance-related mails on the group and it
feels like I am getting subpar performance. Or is this acceptable
search performance?
During my tests I found that elasticsearch was bottlenecked on
disk I/O. So I added 3 more EBS drives to the same machine and started 3
new elasticsearch nodes on it. So now I had 4 elasticsearch nodes
running on the same server. Here are the performance test results with this
configuration:
- 10 threads, 50 requests per thread - 11.6 QPS and average response
  time of ~ 842 ms
- 7 threads, 100 requests per thread - 13.4 QPS and average response
  time of ~ 512 ms
- 8 threads, 100 requests per thread - 14.3 QPS and average response
  time of ~ 550 ms
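The reported throughput and latency figures can be cross-checked with Little's law: if each thread sends its next request as soon as the previous one returns, the average response time should be close to the number of threads divided by the QPS. The sketch below applies that to the five runs in this post. It assumes no think time or ramp-up in the JMeter plan, which the post does not state.

```python
def implied_latency_s(threads, qps):
    """Average response time implied by Little's law for a closed loop
    of `threads` clients with no think time."""
    return threads / qps


# (setup, threads, reported QPS, reported average response time)
runs = [
    ("1 node", 10, 2.3, "> 4 sec"),
    ("1 node", 5, 3.4, "> 1 sec"),
    ("4 nodes", 10, 11.6, "~ 842 ms"),
    ("4 nodes", 7, 13.4, "~ 512 ms"),
    ("4 nodes", 8, 14.3, "~ 550 ms"),
]

for setup, threads, qps, reported in runs:
    implied_ms = implied_latency_s(threads, qps) * 1000
    print(f"{setup}, {threads} threads, {qps} QPS: "
          f"implied {implied_ms:.0f} ms, reported {reported}")
```

The implied values (about 4.3 s, 1.5 s, 862 ms, 522 ms and 559 ms) agree with the reported averages. That suggests the test threads are saturating the server, so throughput is limited by how long each query takes rather than by the load generator.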
Although this is a huge improvement, even with 4 drives
elasticsearch is still bottlenecked on disk I/O. Is this expected?
P.S. I have come across various posts mentioning that routing
greatly improves performance, but I have no idea how to use it in my use
case.
Thanks in advance,
Devashish Tyagi