Hi all,
I'm experimenting with ES (version 0.90.5) for indexing/search
performance. Below is my cluster setup:
- Master on a separate machine (node.master: true, node.data: false)
- 4 data nodes on separate machines (node.master: false, node.data: true)
- index.number_of_shards = 4 and index.number_of_replicas = 1
Those are the only settings I've changed in the elasticsearch.yml file of
the respective nodes.
I'm comparing the performance with Solr 4.4 (essentially SolrCloud).
I indexed 4 million documents, where each document has 10 English
sentences (each sentence is about 10 words).
So each primary shard has approximately 1 million docs, which is nice as
I have 4 million docs and 4 shards.
I'm running the following queries (the cluster is dedicated to this test;
no other processes are running on it apart from ES):
1. "*" query with size set to 4 million - to get all docs (i.e. 4 million hits)
2. "*" query with size set to 100 - to get only the top 100 hits
3. a term query with size again set to 4 million - to retrieve all rows
4. the same term query with size set to 100 - to get only the top 100 hits
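For reference, here is a rough sketch of what the four request bodies look like. It assumes the "*" query is equivalent to a match_all query and that size is passed in the request body; the field name "body" and the term "search" are hypothetical placeholders, not the actual values from my test.

```python
import json

TOTAL_DOCS = 4_000_000  # 4 million indexed documents, as described above


def match_all_body(size):
    """Request body that matches every document and returns `size` hits."""
    return {"query": {"match_all": {}}, "size": size}


def term_body(field, value, size):
    """Request body for a term query on `field`, returning `size` hits."""
    return {"query": {"term": {field: value}}, "size": size}


# "body" and "search" are hypothetical placeholders for the real field and term.
queries = [
    match_all_body(TOTAL_DOCS),               # query 1: all docs
    match_all_body(100),                      # query 2: top 100 hits
    term_body("body", "search", TOTAL_DOCS),  # query 3: all matching rows
    term_body("body", "search", 100),         # query 4: top 100 hits
]

for number, body in enumerate(queries, start=1):
    print(number, json.dumps(body, sort_keys=True))
```

Queries 1 and 3 differ from 2 and 4 only in the size value, so any difference in timing between them comes from how many hits have to be fetched and returned.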
It is encouraging to see the response times of queries 2, 3 & 4 when
compared to Solr (SolrCloud): ES is 2-3 times faster for 2 & 4 and
consistently faster for 3.
But query 1 takes a hell of a lot more time than SolrCloud: ES took 279
secs, whereas SolrCloud took 55 secs.
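To put those two timings in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above and assumes all 4 million hits were actually returned in both cases.

```python
TOTAL_HITS = 4_000_000  # hits requested by query 1
ES_SECONDS = 279        # measured ES response time
SOLR_SECONDS = 55       # measured SolrCloud response time

# Effective throughput in hits returned per second.
es_rate = TOTAL_HITS / ES_SECONDS
solr_rate = TOTAL_HITS / SOLR_SECONDS

# How many times slower ES was than SolrCloud for this query.
slowdown = ES_SECONDS / SOLR_SECONDS

print(f"ES:        {es_rate:,.0f} hits/s")
print(f"SolrCloud: {solr_rate:,.0f} hits/s")
print(f"ES is about {slowdown:.1f}x slower on query 1")
```

So ES comes out roughly five times slower on the fetch-everything case, even though it is faster on the other three queries.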
Why is the "match_all" query in ES taking so much more time compared
to Solr? Is the "match_all" query a bottleneck or a known limitation
of ES?
Thanks,
Phani.