QueryString vs multiple wildcards

I can't find any documentation that compares these queries or discusses their performance.

I'm wondering which performs better (is faster): multiple wildcard filters or a single query_string query?

"query": {
    "bool" : {
      "must" : [
        {
          "query_string" : {
            "query" : "*val1* OR *val2*",
            "default_field" : "field",
            "fields" : [ ],
            "type" : "best_fields",
            "default_operator" : "or",
            "max_determinized_states" : 10000,
            "enable_position_increments" : true,
            "fuzziness" : "AUTO",
            "fuzzy_prefix_length" : 0,
            "fuzzy_max_expansions" : 50,
            "phrase_slop" : 0,
            "escape" : false,
            "auto_generate_synonyms_phrase_query" : true,
            "fuzzy_transpositions" : true,
            "boost" : 1.0
          }
        }
      ],
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  }

or

"query": {
    "bool" : {
      "filter" : [
        {
          "bool" : {
            "should" : [
              {
                "wildcard" : {
                  "field" : {
                    "wildcard" : "*val1*",
                    "boost" : 1.0
                  }
                }
              },
              {
                "wildcard" : {
                  "field" : {
                    "wildcard" : "*val2*",
                    "boost" : 1.0
                  }
                }
              }
            ],
            "adjust_pure_negative" : true,
            "boost" : 1.0
          }
        }
      ],
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  }

What is the mapping type of the field named "field"?

There are two cases: in one the field is text, in the other it is keyword.

It will be super slow whichever query you use.

What is your use case?

Have a look at Keyword type family | Elasticsearch Guide [8.8] | Elastic


We're dealing with both cases; the same query will run against both text and keyword fields. The question still stands: which is better, multiple wildcards or query_string?

Wildcard queries, especially with leading wildcards, are the most inefficient types of queries you can run in Elasticsearch, at least as long as you are not using the new wildcard field type. Both of these queries will perform and scale badly, so what you are really asking is which one is the least awful. I would recommend benchmarking them yourself, as the result is likely to depend on your data, cluster specification, and so on.
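To make the benchmarking suggestion concrete, here is a minimal sketch of a timing harness in Python. It is not from the thread. The helper names (query_string_body, wildcard_body, benchmark) and the search callable are hypothetical, and the query bodies are trimmed-down versions of the two posted above that keep only the essential parameters. The sketch assumes you supply your own search function, for example a thin wrapper around your client's search call against your index.

```python
import statistics
import time
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]


def query_string_body(field: str, values: List[str]) -> Query:
    """Build the query_string variant: "*val1* OR *val2*" on one field.

    Assumes the values contain no query_string special characters;
    real input would need escaping.
    """
    expr = " OR ".join(f"*{v}*" for v in values)
    return {"query": {"bool": {"must": [
        {"query_string": {"query": expr, "default_field": field}}
    ]}}}


def wildcard_body(field: str, values: List[str]) -> Query:
    """Build the bool/filter variant: one wildcard clause per value.

    Uses "value", the documented parameter name for the pattern.
    """
    clauses = [{"wildcard": {field: {"value": f"*{v}*"}}} for v in values]
    return {"query": {"bool": {"filter": [{"bool": {"should": clauses}}]}}}


def benchmark(search: Callable[[Query], Any], body: Query,
              runs: int = 20, warmup: int = 3) -> float:
    """Return the median wall-clock latency of search(body) in milliseconds."""
    for _ in range(warmup):          # warm-up calls are not timed
        search(body)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        search(body)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Run benchmark once per query body against the same index, for both the text and the keyword mapping, and compare the medians. Client-side timing includes network overhead, and repeated identical requests may be served from caches, so treat the numbers as a rough comparison rather than a precise measurement. Note also that the two posted queries differ in context (must versus filter), which by itself can affect the result.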

