On Sep 7, 2011, at 8:40 PM, jprante wrote:
Hi Dario,
On Sep 7, 7:45 pm, Dario Rigolin drigo...@gmail.com wrote:
What I can quickly guess from that gist is:
- you declare ten facets; all of those facets will "sum up" and slow
down the single ES node. Do you really need ten facets, or do they
replicate the same data? Will you need to present all ten facets to
the user at once? E.g. biblevel_full and class_desc look like
candidates for removal.
We usually need more than those 10.
We fully index every UNIMARC subfield, and we create sort and facet fields for them.
I cannot remove them. We also cannot use stop words because librarians need to find books with titles like "The and or not"...
Jörg, you indexed 18M records on 3 ES nodes; what speed do you get for a facet query on the author field, with a match_all query or a query like "berlin"?
What is the hardware configuration of your nodes?
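For reference, this is roughly the shape of query I am timing (a minimal sketch against the 0.x facets API; the index name "biblio" and field name "author" are placeholders, not the exact names from my gist):

  # terms facet on the author field for a "berlin" query;
  # swap the query for { "match_all": {} } for the other case
  curl -XGET 'http://localhost:9200/biblio/_search?pretty=true' -d '{
    "query": { "query_string": { "query": "berlin" } },
    "facets": {
      "author": { "terms": { "field": "author", "size": 10 } }
    }
  }'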
- one shard is far too few if you have a multi-core CPU; think about
using "at least one shard per core", that's my rule of thumb. Then
the resource consumption of facet computation will spread over the
cores more easily.
Using 5 shards I was running out of memory in my previous tests.
I can try 2, as the CPU is a dual core.
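If I understand the suggestion correctly, that means recreating the index with something like this (a sketch; the index name and replica count are just an example):

  # create the index with 2 primary shards, one per core
  curl -XPUT 'http://localhost:9200/biblio/' -d '{
    "settings": {
      "index": {
        "number_of_shards": 2,
        "number_of_replicas": 1
      }
    }
  }'

As far as I know the shard count cannot be changed on an existing index, so this means a full reindex on my side.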
More analysis is surely possible with some statistics about the facets
(result length, values, cardinalities), and the documents and queries
you use.
My tests were very simple, but I was looking for numbers on ES performance compared to SOLR.
I know that faceting on large sets is a very memory- and CPU-intensive task, and that caching is a key point for getting good performance.
I was expecting ES to be as fast as SOLR at faceting, and given the other good things ES can do, we were planning to move from SOLR to ES in our OPAC application. But faceting performance on medium record sets (> 1.5M) makes me think carefully.
ES scaling is very nice: I can add more nodes and performance increases (in SOLR this cannot be done so easily). But comparing pure "single node" performance makes me think that:
- I need to understand better how ES faceting works and how it can be optimized.
- At the moment 11M records are handled easily by a single SOLR node with 8G of RAM. If moving to ES means adding more hardware and more RAM, that is not a simple process for us.
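Before adding hardware I will at least try giving the ES JVM more of the 8G box and watch how much heap the facet field cache takes. A rough sketch of what I mean (ES_MIN_MEM/ES_MAX_MEM as read by bin/elasticsearch.in.sh in the 0.x series; the exact variable names may differ by version and packaging):

  # give the single node a 4g heap before starting it;
  # the values are only an example for an 8G machine
  export ES_MIN_MEM=4g
  export ES_MAX_MEM=4g
  ./bin/elasticsearch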
Best regards,
Jörg
Dario Rigolin
drigolin@gmail.com