Wildcard query does not return a document when its data is repeated more than 2 times

Hi Elastic team & community,
I have a problem with a wildcard query that does not return the expected results when documents contain repeated data.
With the requests below, I expected the search to return 2 documents, but it returns only 1.

Could you please help me update the query so that it returns the right result?


# CREATE SHORT DATA
POST /test/_doc
{
  "content": "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><"
}

# CREATE LONG DATA
POST /test/_doc
{
  "content": "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+-=]¥}|”;:/><"
}

# QUERY
POST /test/_search
{
  "size": 10000,
  "query": {
    "bool": {
      "should": [
        {
          "wildcard": {
            "content.keyword": {
              "value": "*PQRSTUVWXYZ*"
            }
          }
        }
      ]
    }
  }
}

Thank you very much!!!

If you are using the default (dynamic) mappings, the content.keyword sub-field is created with ignore_above set to 256. A string longer than 256 characters, which I suspect is the case for your second document, is not indexed in the keyword sub-field, so a query against content.keyword cannot find it.
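
As a rough illustration of the point above: each repetition of the test pattern is 61 characters, so the short document (2 repetitions, 122 characters) fits under the 256 limit, while anything from 5 repetitions up (305 characters or more) does not. The requests below are a minimal sketch of one way to check and work around this. The index name test_long and the value 2048 are assumptions chosen for the example, not something from the original setup, and the documents would need to be indexed again into the new index.

```
# Inspect the mapping that was generated for the existing index;
# with dynamic mapping, content.keyword should show "ignore_above": 256
GET /test/_mapping

# Hypothetical new index whose keyword sub-field accepts longer strings
# (index name and limit are illustrative assumptions)
PUT /test_long
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 2048
          }
        }
      }
    }
  }
}
```

After re-posting both documents to /test_long/_doc, the original wildcard query against content.keyword can be run on the new index to see whether both documents come back.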


Note that the newer wildcard field type was designed to overcome the size limits of keyword and may be of interest (though see the decision chart at the end of the blog post before choosing it).
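
For reference, a mapping that uses the wildcard field type might look like the sketch below. It assumes Elasticsearch 7.9 or later, where this field type is available, and the index name test_wildcard is a hypothetical name used only for this example.

```
# Hypothetical index that maps content as a wildcard field
PUT /test_wildcard
{
  "mappings": {
    "properties": {
      "content": { "type": "wildcard" }
    }
  }
}

# Same pattern as in the question, queried directly against the wildcard field
POST /test_wildcard/_search
{
  "query": {
    "wildcard": {
      "content": { "value": "*PQRSTUVWXYZ*" }
    }
  }
}
```

With this mapping there is no keyword sub-field involved, so the query targets content itself rather than content.keyword.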

