Facets/Aggregations and excluding filters

pzaleski · August 21, 2015, 7:20am

We're currently evaluating Elasticsearch 1.7 vs. Solr 5. One of our use cases is facets/aggregations with filters. When a filter is set on a faceted/aggregated field, that filter should be excluded when computing the facet for that field, so that the facet doesn't collapse to just one value.

In Solr we achieve this using filter tagging, as follows:

select?qt=fieldsearch&q=*:*&start=0&rows=0
&fq={!tag=FILTER1}((field_1:("3098")))
&facet=true&facet.limit=-1&facet.mincount=1
&facet.field={!ex=FILTER1 key=FILTER1}field_1
&facet.field=field_2
&facet.field=field_3
&wt=json&indent=off

To get similar results in Elasticsearch we're using the following query:

{
    "size": 0,
    "aggs": {
        "field1": {
            "terms": {
                "field": "field1",
                "size": 0
            }
        },
        "facets_with_all_filters": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "field1": "3102"
                            }
                        }
                    ]
                }
            },
            "aggs": {
                "field2": {
                    "terms": {
                        "field": "field2",
                        "size": 0
                    }
                },
                "field3": {
                    "terms": {
                        "field": "field3",
                        "size": 0
                    }
                }
            }
        }
    }
}

Solr performs much better in this case. Is there a better way to write this query in Elasticsearch, or a way to optimize it?

polyfractal · August 21, 2015, 11:44am

Apologies in advance, I'm not super familiar with Solr's syntax. But I'm pretty sure those queries are asking for different results. The Solr query is asking for:

All search hits which have field_1:"3098"
All terms in field_1 whose documents match FILTER2 + FILTER3
All terms in field_2 whose documents match FILTER1 + FILTER3
All terms in field_3 whose documents match FILTER1 + FILTER2
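
To make the exclusion semantics concrete, here is a small in-memory sketch in Python. It is not Solr or Elasticsearch code; the `facet_counts` helper and the sample documents are made up for illustration. Each facet is counted over the documents that match every active filter except the one on the facet's own field:

```python
from collections import Counter

def facet_counts(docs, filters, facet_field):
    """Count facet_field values over docs matching all filters except the one on facet_field."""
    # Drop the filter on the faceted field itself (the "tag and exclude" idea).
    others = {f: v for f, v in filters.items() if f != facet_field}
    matching = [d for d in docs if all(d.get(f) == v for f, v in others.items())]
    return Counter(d[facet_field] for d in matching if facet_field in d)

# Hypothetical sample data.
docs = [
    {"field_1": "3098", "field_2": "a"},
    {"field_1": "3098", "field_2": "b"},
    {"field_1": "3102", "field_2": "a"},
]
filters = {"field_1": "3098"}

field_1_facet = facet_counts(docs, filters, "field_1")  # own filter excluded
field_2_facet = facet_counts(docs, filters, "field_2")  # field_1 filter applied
```

Here the field_1 facet still lists both "3098" and "3102", while the field_2 facet only counts documents with field_1 = "3098".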

You have some syntax problems, but assuming the hierarchy means nesting, your Elasticsearch aggregation is asking for:

All search hits in the index (no filter)
All terms in field_1
All terms in field_2 whose documents match field_1:3102
- For each term in the previous aggregation, generate all terms in field_3

A more comparable query would look something like this (annotated with comments):

{
  "size": 0,
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "term": { "field_1": "3098" } // &fq={!tag=FILTER1}((field_1:("3098")))
      }
    }
  },
  "aggs": {

    // &facet.field={!ex=FILTER1 key=FILTER1_FACET}field_1
    "FILTER1_FACET": {
      "global": {},    // Global context, since the search is filtering FILTER1 and we don't want that
      "aggs": {
        "FILTER1_FACET_FILTERED": {    // Sub-aggregations must be named; this name is a placeholder
          "filter": {
            "bool": {
              "must": [
                { "term": { "<FILTER2 FIELD>": "<some value>" } },
                { "term": { "<FILTER3 FIELD>": "<some value>" } }
              ]
            }
          },
          "aggs": {
            "FILTER1_FACET_TERMS": {
              "terms": { "field": "field_1" }
            }
          }
        }
      }
    },

    // &facet.field={!ex=FILTER2 key=FILTER2_FACET}field_2
    "FILTER2_FACET": {
      "filter": {  // Already includes FILTER1 from filtered query, so include FILTER3
        "term": { "<FILTER3 FIELD>": "<some value>" }
      },
      "aggs": {
        "FILTER2_FACET_TERMS": {
          "terms": {
            "field": "field_2"
          }
        }
      }
    },

    // &facet.field={!ex=FILTER3 key=FILTER3_FACET}field_3
    "FILTER3_FACET": {
      "filter": {  // Already includes FILTER1 from filtered query, so include FILTER2
        "term": { "<FILTER2 FIELD>": "<some value>" }
      },
      "aggs": {
        "FILTER3_FACET_TERMS": {
          "terms": {
            "field": "field_3"
          }
        }
      }
    }
  }
}
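
If you have many facets, writing these filter aggregations by hand gets tedious. Below is a rough Python sketch that generates a request body following the same idea: each facet is a `filter` aggregation carrying every active filter except its own, wrapping a `terms` aggregation. Note that it is a variation on the query above, not the same thing: it leaves the main query unfiltered and puts the full filter set in `post_filter`, so no `global` aggregation is needed. The function names (`term_filter`, `combine`, `build_request`) and the `_facet` / `_terms` key suffixes are made up for this example, only term filters are handled, and I haven't run the output against a live 1.x cluster.

```python
def term_filter(field, value):
    """A single term filter clause."""
    return {"term": {field: value}}

def combine(filters):
    """AND a list of filter clauses; fall back to match_all when the list is empty."""
    if not filters:
        return {"match_all": {}}
    if len(filters) == 1:
        return filters[0]
    return {"bool": {"must": filters}}

def build_request(active_filters, facet_fields):
    """Build a search body where each facet ignores the filter on its own field.

    active_filters: dict of field -> value currently selected by the user.
    facet_fields:   list of fields to facet on.
    """
    aggs = {}
    for field in facet_fields:
        # Every active filter except the one on the faceted field itself.
        others = [term_filter(f, v)
                  for f, v in sorted(active_filters.items()) if f != field]
        aggs[field + "_facet"] = {
            "filter": combine(others),
            "aggs": {field + "_terms": {"terms": {"field": field}}},
        }
    body = {"size": 0, "aggs": aggs}
    if active_filters:
        # post_filter narrows the hits without affecting the aggregations.
        body["post_filter"] = combine(
            [term_filter(f, v) for f, v in sorted(active_filters.items())])
    return body

body = build_request({"field_1": "3098"}, ["field_1", "field_2", "field_3"])
```

Serialize `body` with `json.dumps` and send it to the `_search` endpoint. With `size` set to 0 the `post_filter` doesn't matter much, but it keeps the hits consistent if you later ask for documents as well.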

Now, as far as performance, it's pretty hard to compare. ES is distributed by nature, while Solr has the benefit of being monolithic. If I understand correctly, Solr also includes a certain amount of aggressive caching that is invalidated when you index new documents, so comparisons can easily be skewed if you are just running searches.
