Slow wildcard query with fast term filters


(Roman Margolis) #1

Hi, I have a simple query that looks like this:

{
    "query": {
        "bool": {
            "filter": {
                "bool": {
                    "must": [
                        { "term": { "a": "term1" }},
                        { "term": { "a": "term2" }},
                        { "term": { "a": "term3" }},
                        { "term": { "a": "term4" }},
                        { "term": { "a": "term5" }},
                        { "wildcard": { "b": "*a rather*long*expression of*various*words" }}
                    ]
                }
            }
        }
    }
}

The query takes about 500 ms to execute on an index of 3 million documents spread evenly across 4 shards, on ES 5.3.2.
The query returns no results for these particular terms and wildcard expression, which is fine.
If I remove the wildcard query, the query returns in 2 ms, also with no results (again, this is fine).
I tried placing the wildcard query in different positions inside the bool, to no avail.

I thought that the wildcard filter, combined with the term filters, would execute efficiently, but it looks like the wildcard filter scans the entire index regardless of the term filters.

Is there any way to make this work efficiently?

Thanks,
Roman Margolis


(Roman Margolis) #2

I think I can answer my own question. From various sources, I gather that a wildcard filter always scans the entire index, regardless of what other filters are present, because it is cacheable.

So it appears there's no way to do what I want efficiently out of the box. However, a rather simple solution can be built: a script wildcard filter (or a wildcard plugin). Script filters are executed later in the query pipeline (I think), after the cheaper term filters, and operate on the already filtered bitset.
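
To make the ordering idea concrete, here is a rough sketch in plain Python. It is not Elasticsearch code and does not model how Lucene really evaluates a wildcard query (which expands the pattern against the terms dictionary rather than testing documents one by one). The DOCS list, the function names, and the use of fnmatch as a stand-in for wildcard matching are all assumptions made up for illustration. It only counts how many expensive pattern checks each ordering performs.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory stand-in for an index: field "a" holds a set of
# terms, field "b" holds a single untokenized string.
DOCS = [
    {"a": {"term1", "term2"}, "b": "this is a rather very long expression of some various words"},
    {"a": {"term1", "term2"}, "b": "something unrelated"},
    {"a": {"term3"}, "b": "a rather long expression of various words"},
    {"a": {"term1"}, "b": "short"},
]

PATTERN = "*a rather*long*expression of*various*words"


def wildcard_first(docs, terms, pattern):
    """Run the expensive pattern check on every document, then the term check."""
    checks = 0
    hits = []
    for doc in docs:
        checks += 1  # pattern is evaluated for every document
        if fnmatchcase(doc["b"], pattern) and terms <= doc["a"]:
            hits.append(doc)
    return hits, checks


def terms_first(docs, terms, pattern):
    """Run the cheap term check first; pattern-match only the survivors."""
    checks = 0
    hits = []
    for doc in docs:
        if not terms <= doc["a"]:
            continue  # rejected by the cheap filter, pattern never evaluated
        checks += 1
        if fnmatchcase(doc["b"], pattern):
            hits.append(doc)
    return hits, checks


if __name__ == "__main__":
    wanted = {"term1", "term2"}
    print(wildcard_first(DOCS, wanted, PATTERN)[1])
    print(terms_first(DOCS, wanted, PATTERN)[1])
```

Both functions return the same hits; the difference is only in how many times the pattern is evaluated, which is the saving I would hope to get from a script filter if it really does run after the term filters.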

What do you think?

