Filters, facets, and tuning

Hi there! We are trying to tune our ES cluster: we are seeing many slow
queries (from 200 ms up to 4 s) now that we have launched it for real-world usage.

We have a small index (80 GB, 25M docs; we store the source, which is pretty
big) and 8 document types.

We are using the default 5 shards with 2 replicas (3 servers right now);
each server is a 64 GB box with a 3 GHz Intel Xeon, 24 cores.

Thanks to bigdesk :smiley: we are monitoring the cluster: CPU usage is minimal
(less than 5% on average), and we are seeing around 5-6 QPS on each node.

I'm just starting to scratch the surface of tuning ES (still a long road ahead),
but one thing I would like to do is use filters more often when executing
faceted navigation.

So far our approach has been to append the selected facet value to the query.
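
For example, when a user picks a genre we currently rebuild the query itself, roughly like this (just a sketch; genreName is one of our fields and "Rock" is a placeholder value):

{
  "query": {
    "bool": {
      "must": {
        "term": { "genreName": "Rock" }
      },
      "must_not": {
        "query_string": { "query": "published:F OR active:F" }
      }
    }
  }
}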

This works well, but I think (please correct me if I'm wrong) that the
preferred approach would be to use facet_filters instead.

I noticed that if I add a "filter" to the request, the hits get narrowed
down, but the facet counts do not reflect the new results; they are still
the counts for the original match_all query.
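
That is, with something like this (again just a sketch of what we tried; "Rock" is a placeholder value), the hits get filtered but the facet counts still cover every document matching the main query:

{
  "query": { "match_all": {} },
  "filter": {
    "term": { "genreName": "Rock" }
  },
  "facets": {
    "style": {
      "terms": { "field": "styleName", "size": 999, "all_terms": true }
    }
  }
}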

I also noticed that to get the new counts I need to add the proper
facet_filter to each facet. Is this right? The main problem is that we have
around 20 facets; does that mean I have to keep stacking facet_filters
onto each facet based on the user's interaction?
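
In other words, something like this on every one of the ~20 facets (a sketch; the two term filters stand for two hypothetical user selections):

{
  "query": { "match_all": {} },
  "facets": {
    "style": {
      "terms": { "field": "styleName", "size": 999, "all_terms": true },
      "facet_filter": {
        "and": [
          { "term": { "genreName": "Rock" } },
          { "term": { "explicit": "F" } }
        ]
      }
    }
  }
}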

What is the best way to tackle this: use filters + facet_filters, or just
add more terms to the query to narrow it down?

BTW, this is a sample slow query we have:

{
  "from": 2784,
  "size": 96,
  "query": {
    "bool": {
      "must_not": {
        "query_string": {
          "query": "published:F OR active:F"
        }
      }
    }
  },
  "sort": [
    {
      "albumDownloadsMonth": {
        "order": "desc"
      }
    }
  ],
  "facets": {
    "genre": {
      "terms": {
        "field": "genreName",
        "size": 999,
        "all_terms": true
      }
    },
    "compilation": {
      "terms": {
        "field": "compilation",
        "size": 999,
        "all_terms": true
      }
    },
    "singleEp": {
      "terms": {
        "field": "singleEp",
        "size": 999,
        "all_terms": true
      }
    },
    "editorsPick": {
      "terms": {
        "field": "editorsPick",
        "size": 999,
        "all_terms": true
      }
    },
    "style": {
      "terms": {
        "field": "styleName",
        "size": 999,
        "all_terms": true
      }
    },
    "explicit": {
      "terms": {
        "field": "explicit",
        "size": 999,
        "all_terms": true
      }
    },
    "alpha": {
      "terms": {
        "field": "nameLetter",
        "size": 999,
        "all_terms": true
      }
    },
    "freeTracks": {
      "terms": {
        "field": "freeTracks",
        "size": 999,
        "all_terms": true
      }
    },
    "live": {
      "terms": {
        "field": "live",
        "size": 999,
        "all_terms": true
      }
    },
    "new": {
      "range": {
        "field": "releaseDate",
        "ranges": [
          { "from": "1353165612946" },
          { "from": "1352906412946" },
          { "from": "1355214180242" },
          { "from": "1352039984018" }
        ]
      }
    },
    "decade": {
      "range": {
        "field": "releaseDate",
        "ranges": [
          { "from": "-2208970740000", "to": "-1893437940000" },
          { "from": "-1893437940000", "to": "-1577905140000" },
          { "from": "-1577905140000", "to": "-1262285940000" },
          { "from": "-1262285940000", "to": "-946753140000" },
          { "from": "-946753140000", "to": "-631133940000" },
          { "from": "-631133940000", "to": "-315601140000" },
          { "from": "-315601140000", "to": "18060000" },
          { "from": "18060000", "to": "315550860000" },
          { "from": "315550860000", "to": "631170060000" },
          { "from": "631170060000", "to": "946702860000" },
          { "from": "946702860000", "to": "1262322060000" },
          { "from": "1262322060000", "to": "1577854860000" }
        ]
      }
    },
    "multiDisc": {
      "range": {
        "field": "numDiscs",
        "ranges": [
          { "from": "2" }
        ]
      }
    },
    "rated": {
      "range": {
        "field": "rating",
        "ranges": [
          { "from": "3.50" }
        ]
      }
    },
    "advance": {
      "range": {
        "field": "releaseDate",
        "ranges": [
          { "from": "1353511212946" }
        ]
      }
    }
  }
}

Best regards

--

Answers inline.

HTH

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 21 Nov 2012, at 17:48, Vinicius Carvalho viniciusccarvalho@gmail.com wrote:

> I noticed that if I add a "filter" to the request, the hits get narrowed down, but the facet counts do not reflect the new results; they are still the counts for the original match_all query.

True

> I also noticed that to get the new counts I need to add the proper facet_filter to each facet. Is this right? The main problem is that we have around 20 facets; does that mean I have to keep stacking facet_filters onto each facet based on the user's interaction?

Yes

> What is the best way to tackle this: use filters + facet_filters, or just add more terms to the query to narrow it down?

Add your filters to the facets. Filters reduce the dataset before executing the query, and they are more efficient.
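
For example (an untested sketch based on your genre and style facets; "Rock" is a placeholder value): use a filtered query for the hits and repeat the same filter as a facet_filter on each facet, so the counts follow the selection:

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must_not": {
            "query_string": { "query": "published:F OR active:F" }
          }
        }
      },
      "filter": {
        "term": { "genreName": "Rock" }
      }
    }
  },
  "facets": {
    "style": {
      "terms": { "field": "styleName", "size": 999, "all_terms": true },
      "facet_filter": { "term": { "genreName": "Rock" } }
    },
    "explicit": {
      "terms": { "field": "explicit", "size": 999, "all_terms": true },
      "facet_filter": { "term": { "genreName": "Rock" } }
    }
  }
}

If the user has selected more than one value, combine them with an and filter and reuse that same filter everywhere. Term filters can be cached, so repeating them should be cheap.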

> BTW, this is a sample slow query we have:

(skipped)

--