I'm experimenting with ES (version 0.90.5) to measure indexing/search
performance. Below is my cluster setup:
- Master on separate machine (node.master: true, node.data: false)
- 4 data nodes on separate machines (node.master: false, node.data:
- index.number_of_shards = 4 and index.number_of_replicas = 1
Those are the only settings I've changed in the elasticsearch.yml file of
I'm comparing the performance with Solr 4.4 (essentially SolrCloud).
I indexed 4 million documents, where each document has 10 English
sentences (each sentence is about 10 words).
So each primary shard holds approximately 1 million docs, which is a nice
even split of 4 million docs across 4 shards.
I'm running the following queries (the cluster is dedicated; no other
processes are running on it apart from ES):
1. "*" query - to get all docs (i.e. 4 million hits)
2. "*" query with size set to 100 - to get only the top 100 hits
3. a term query with size again set to 4 million to retrieve all rows
4. same term query with size set to 100 - to get only the top 100 hits
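For reference, here is a rough sketch of the four request bodies listed above, written as Python dicts. The field name "body" and the term "example" are placeholders I made up for illustration, not the real ones from my index, and the sketch only builds and prints the JSON; it does not contact a cluster.

```python
import json

TOTAL_DOCS = 4000000  # number of indexed documents
TOP_N = 100           # "top hits only" page size


def build_queries(field, term, total_docs=TOTAL_DOCS, top_n=TOP_N):
    """Return the four search bodies in the order they are listed above."""
    match_all = {"match_all": {}}
    term_query = {"term": {field: term}}
    return [
        {"query": match_all, "size": total_docs},   # 1: all docs
        {"query": match_all, "size": top_n},        # 2: top 100 of all docs
        {"query": term_query, "size": total_docs},  # 3: all term matches
        {"query": term_query, "size": top_n},       # 4: top 100 term matches
    ]


if __name__ == "__main__":
    # "body" and "example" are hypothetical placeholders.
    for number, body in enumerate(build_queries("body", "example"), start=1):
        print(number, json.dumps(body))
```

Each body would be sent to the index's _search endpoint; only the "size" value differs between the fast and the slow variants of the same query.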
It is encouraging to see the response times of queries 2, 3 & 4 when
compared to Solr (SolrCloud): ES is 2-3 times faster for 2 & 4 and
consistently faster for 3.
But query 1 takes far longer than on SolrCloud. ES took 279
secs whereas SolrCloud took 55 secs.
Why is the "match_all" query in ES taking so much more time compared
to Solr? Is the "match_all" query a bottleneck for ES / known limitation