One immediate difference is that ES defaults to 5 shards and 1 replica per
shard (i.e. a primary and one copy). The replication factor means 2x the
storage. Also, I think you get a local gateway out of the box, which would
keep another copy of all the shards and replicas, making for up to 4x the
actual index size. 2.1GB is between 3x and 4x 650MB, so it is about what
I'd expect.
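As a rough check of that arithmetic, here is a small sketch. It simply takes the reasoning above at face value (one replica doubles the data, and the local gateway is assumed to keep one more copy of everything); the function name and the `gateway_copies` parameter are made up for illustration and are not ES settings.

```python
def expected_size_mb(base_mb, replicas, gateway_copies):
    """Estimate on-disk size under the assumptions stated above."""
    # Each shard exists once as a primary plus once per replica.
    shard_copies = 1 + replicas
    # Assumption: the gateway stores one extra copy of every shard copy.
    return base_mb * shard_copies * (1 + gateway_copies)


# 650MB Solr index, 1 replica, 1 assumed gateway copy.
estimate = expected_size_mb(650, replicas=1, gateway_copies=1)

# Ratio actually reported in the question: 2.1GB vs 650MB.
observed_ratio = 2100 / 650

print(estimate, round(observed_ratio, 1))
```

The estimate is an upper bound under these assumptions; the reported size works out to a bit over 3x, so it sits below it.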
With 5 shards, a query will also take longer, as it has to fan out to all
5 shards and merge the results (a map/reduce of sorts).
So for a single-node, single-shard, like-for-like test you should set shards
to 1 and replicas to 0. Then they'll be comparable. But then, of course, you
have negated the reason why you would choose ES in the first place which is
to increase write-throughput and to make your index scalable and much more
available.
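For what it's worth, a minimal sketch of creating such an index over the REST API might look like the following. The index name `benchmark` and the `http://localhost:9200` address are assumptions for illustration, and the helper names are made up; `number_of_shards` and `number_of_replicas` are the index settings in question.

```python
import json
import urllib.request


def single_shard_settings():
    """Index settings for a like-for-like, single-node comparison."""
    return {"settings": {"number_of_shards": 1, "number_of_replicas": 0}}


def build_create_index_request(base_url, index_name):
    """Build (but do not send) the PUT request that creates the index."""
    body = json.dumps(single_shard_settings()).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed host and index name; adjust to your own setup.
request = build_create_index_request("http://localhost:9200", "benchmark")
```

Passing `request` to `urllib.request.urlopen` would send it to a running node. The shard count is fixed when the index is created, so the data has to be reindexed into the new index before rerunning the query.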
Cheers,
On Mon, Aug 1, 2011 at 9:05 PM, Michael Feingold <mfeingold@hill30.com> wrote:
In my quest for a search platform for our internal needs I am playing
with both Solr and ES. I indexed the same data (~4.5M documents) with
the same index structure and ran the same query on both.
What surprises me is that for whatever reason ES search is 1.5 times
slower than Solr. Also the Solr index size is 650MB vs 2.1GB for ES.
Can you help me figure out what I am missing here?
ES configuration: Elastic Search settings and mapping · GitHub
Solr Schema: Solr schema · GitHub
Query: Query · GitHub
--
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy