On Sep 9, 2011, at 12:11 AM, Jason Rutherglen wrote:
We need more data to answer that. Solr intersects [cached] bit sets
which can be very fast! I think ES uses a field cache mechanism, I
don't know if it implements bit sets. Per-segment faceting is
possible with bit sets. It's just software.
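To make the bit-set idea above concrete, here is a minimal sketch of facet counting by intersection. It is not Solr's or ES's actual code: the names to_bitset, facet_counts and cached are hypothetical, and plain Python integers stand in for the bit sets. The assumption is that each facet value has a cached bit set of the documents containing it, so a facet count is just the number of bits left after ANDing it with the query's bit set.

```python
from typing import Dict, Iterable


def to_bitset(doc_ids: Iterable[int]) -> int:
    """Pack document ids into an int used as a bit set (bit i set = doc i)."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def facet_counts(query_bits: int, cached_value_bits: Dict[str, int]) -> Dict[str, int]:
    """Count matching docs per facet value by intersecting bit sets."""
    return {
        value: bin(query_bits & value_bits).count("1")
        for value, value_bits in cached_value_bits.items()
    }


# Hypothetical cache: one bit set per facet value, built once and reused
# across queries (illustrative data, not taken from the thread).
cached = {
    "red": to_bitset([0, 2, 5]),
    "blue": to_bitset([1, 3]),
    "green": to_bitset([4, 5]),
}

# Bit set of the documents matched by the current query.
query_hits = to_bitset([2, 3, 5])
counts = facet_counts(query_hits, cached)
```

The cost per facet value is one AND plus a bit count and does not require loading the field values themselves, which is one plausible reason this approach can be light on memory when the number of distinct values is moderate.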
This was my original hypothesis: Solr's faceting algorithms are more efficient.
On my desktop PC I have tested Solr with 11M docs, and it does faceting within a second even on huge record sets.
In that scenario Solr works with 5G of RAM, while ES with 7G keeps giving OOM errors.
Also, memory usage in ES during faceting is larger than in Solr, and I presume that the "values" of the facets are loaded and intersected.
I presume re-implementing how ES does faceting isn't an easy task, and I like ES's features a lot.
If we were doing only simple searches I would switch from Solr to ES, because it's so easy to handle a cluster, distribute searches, create new databases, and configure everything via "software" instead of via XML files.
I hope this discussion will help improve ES.
Maybe Yonik, one of the fathers of Solr, can give Shay some ideas about how to make ES faceting better.
Ciao
On Thu, Sep 8, 2011 at 6:07 PM, Andy selforganized@gmail.com wrote:
Solr does have some per-segment faceting capabilities. They are not used by
default because it's slower unless you are rapidly updating the index.
So is ES's per-segment faceting the reason why Dario found it to be so
much slower than Solr once indexing is finished (4-7 seconds for ES
vs. sub-second for Solr)?
Is there any way to tune ES to speed that up?
Dario Rigolin
drigolin@gmail.com