[solved] Wildcard search with special characters: Differences between ES 1.7 and ES 5.1

I'm in the process of upgrading a project from ES 1.7 to ES 5.1 and noticed an integration test failing.
Below is a minimal repro case.

A wildcard search for "brö" returns the same result in both versions. But a search for "brö{"/)?", which gets escaped as "brö\{\\\"\/\)", returns 2 results in ES 1.7 and no results in ES 5.1.

I went through the breaking changes for ES 2.x and ES 5.x but could not find anything related to this. I also looked for new query_string settings that might help, but couldn't find any either.
If you have any idea what's going on, please let me know :slight_smile: Thanks.

Repro case:

ES 5 index creation

curl -X PUT 'http://localhost:9200/testwildcard' -d '{
  "mappings": {
    "testwildcard": {
      "properties": {
        "name": {
          "type": "keyword",
          "fields": {
            "search": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}'

ES 1.7 index creation

curl -XPUT 'http://localhost:9200/testwildcard' -d '{
  "mappings": {
    "testwildcard": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed",
          "fields": {
            "search": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}'

Insert data

curl -XPUT 'http://localhost:9200/testwildcard/testwildcard/1' -d '{
 "name": "Brötchen"
}'

curl -XPUT 'http://localhost:9200/testwildcard/testwildcard/2' -d '{
 "name": "Frischbackbrötchen"
}'

curl -XPUT 'http://localhost:9200/testwildcard/testwildcard/3' -d '{
 "name": "Brot"
}'

Works in both ES 1.7 and ES 5.1:

curl -XPOST 'http://localhost:9200/testwildcard/_search' -d '{
  "explain": true,
  "query": {
    "query_string": {
      "query": "name.search:*brö*",
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}'

Returns 2 results in ES 1.7 but none in ES 5.1:

curl -XPOST 'http://localhost:9200/testwildcard/_search' -d '{
  "explain": true,
  "query": {
    "query_string": {
      "query": "name.search:*brö\\{\\\\\\\"\\/\\)*",
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}'
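
For reference, the two layers of escaping in the failing query can be reproduced with a short script. This is a minimal Python sketch, not code from the project: the helper name escape_query_string_term is hypothetical, the reserved-character list follows the query_string documentation, and the raw term brö{\"/) is an assumption inferred by reversing the escaped form quoted at the top of this post.

```python
import json

# Characters that query_string treats as syntax and that can be
# backslash-escaped. "&" and "|" are only reserved when doubled, so
# escaping single occurrences is a conservative choice.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string_term(term: str) -> str:
    """Backslash-escape query_string syntax characters in a user term.

    "<" and ">" cannot be escaped at all, so they are dropped.
    (Hypothetical helper, for illustration only.)
    """
    out = []
    for ch in term:
        if ch in "<>":
            continue
        out.append("\\" + ch if ch in RESERVED else ch)
    return "".join(out)


raw_term = r'brö{\"/)'  # assumed raw user input
escaped = escape_query_string_term(raw_term)

# Wildcards are added after escaping so they keep their special meaning.
body = json.dumps(
    {
        "query": {
            "query_string": {
                "query": "name.search:*" + escaped + "*",
                "analyze_wildcard": True,
                "default_operator": "AND",
            }
        }
    },
    ensure_ascii=False,  # keep "ö" readable in the request body
)

print(escaped)  # query_string-level escaping
print(body)     # JSON-level escaping on top of it
```

The first line printed is the query_string-level form (brö\{\\\"\/\)). JSON encoding then doubles every backslash and escapes the quote, which gives the string used in the request body above.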

It may be interesting that the same wildcard search works with simple_query_string, even with the special characters.

Note that simple_query_string only supports prefix search (a leading * is not allowed).

Example

DELETE /test

PUT /test
{
  "mappings": {
    "testwildcard": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "search": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}

PUT /test/testwildcard/1
{
 "name": "Brötchen"
}

PUT /test/testwildcard/2
{
 "name": "Frischbackbrötchen"
}

PUT /test/testwildcard/3
{
 "name": "Brot"
}

POST /test/_search

POST /test/_search
{
  "explain": true,
  "query": {
    "simple_query_string": {
      "query": "brö*",
      "fields" : [ "name.search" ],
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}

POST /test/_search
{
  "explain": true,
  "query": {
    "simple_query_string": {
      "query": "*brö*",
      "fields" : [ "name.search" ],
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}

POST /test/_search
{
  "explain": true,
  "query": {
    "simple_query_string": {
      "query": "*brö\\{\\\\\\\"\\/\\)*",
      "fields" : [ "name.search" ],
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}

All queries give 1 hit (Brötchen).

Thanks for your reply. That's interesting indeed, and I can confirm it.

I just checked whether it still works as expected in 2.x, and it does. So the change seems to have happened between 2.x and 5.x.
Maybe it's related to the Lucene upgrade from 5.x to 6.0.0?

https://lucene.apache.org/core/6_4_0/changes/Changes.html#v6.0.0
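
One way to narrow down where the behaviour changes is to ask each cluster how it parses the query, using the _validate/query API with explain=true, and then compare the rewritten Lucene query across versions. The following Python sketch only builds and sends that request. The function names and the base URL are assumptions for illustration, and the exact explanation text differs between versions.

```python
import json
import urllib.request


def build_validate_request(base_url: str, index: str, query: str) -> urllib.request.Request:
    """Build a _validate/query?explain=true request for a query_string query.

    Hypothetical helper; uses the same query_string options as the repro case.
    """
    body = {
        "query": {
            "query_string": {
                "query": query,
                "analyze_wildcard": True,
                "default_operator": "AND",
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_validate/query?explain=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain_query(base_url: str, index: str, query: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_validate_request(base_url, index, query)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage against a running node (assumed to listen on localhost:9200):
#   result = explain_query("http://localhost:9200", "testwildcard",
#                          r'name.search:*brö\{\\\"\/\)*')
#   print(json.dumps(result, indent=2, ensure_ascii=False))
```

Running the same call against a 2.x node and a 5.1 node and diffing the explanations should show whether the special characters are dropped or kept when the wildcard term is analyzed.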

It'd be worth raising this on GitHub, if you haven't already :slight_smile:

Not yet, but I've opened one now: https://github.com/elastic/elasticsearch/issues/22989
