Query_string and wildcards behave incorrectly?


(Christoph Seiler) #1

For a client, we built an editor for defining filters through a user interface. The queries generally work, but in at least one case the results are strangely incorrect:

We want a 'real' contains: if the field holds "youtube videos", searching for tub, tube, youtube or youtube videos should each produce a hit.

The approach is the following:

Base Query:
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "query_string": {
                      "default_field": "activityType_de",
                      "query": "*tub*",
                      "boost": 1.0
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ]
        }
      }
    }
  },
  "size": 50
}

Result is:
{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"hits": {
		"total": 1,
		"max_score": 1,
		"hits": [
			{
				"_index": "simple",
				"_type": "activity",
				"_id": "Activity_235",
				"_score": 1,
				"_source": {
					...
					"activityType_de": "youtube video",
					"activityType_en": "youtube video",
					"activityType_it": "youtube video",
					"activityType_fr": "vidéo youtube",
					"activityType": "508",
					...
					"name": "video nummer 1"
				}
			}
		]
	}
}

which is correct, but if we search for youtube:
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "query_string": {
                      "default_field": "activityType_de",
                      "query": "*youtube*",
                      "boost": 1.0
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ]
        }
      }
    }
  },
  "size": 50
}

we get zero hits.


"query": "youtube" -> works
"query": "*youtube*" -> does not work
"query": "tube" -> does not work
"query": "*tube" -> does not work
"query": "*tube*" -> does not work
"query": "*tub*" -> works

Can anybody tell me where my mistake is?


(Mark Walkom) #2

I think you need to wrap your code in the proper formatting; all we can see is italics.


(Christoph Seiler) #3

Better?


(Mark Walkom) #4

Yeah, thanks!
FYI, using a *foobar* search is seriously inefficient; it's akin to a table scan.

Are you setting specific mappings on the various activityType_ fields?
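
To illustrate the table-scan remark, here is a rough Python sketch. It is not how Lucene is actually implemented; it is a simplified model that treats the field's term dictionary as a sorted list. The terms and the helper names (`TERMS`, `wildcard_scan`, `prefix_lookup`) are made up for the example. The point is only that a pattern with a leading wildcard gives the index nothing to seek to, so every term has to be tested, while a plain prefix can jump straight to its first candidate.

```python
from bisect import bisect_left
from fnmatch import fnmatchcase

# Hypothetical sorted term dictionary for one field (illustrative terms only).
TERMS = sorted(["audio", "podcast", "tub", "video", "vimeo", "youtub"])


def wildcard_scan(pattern, terms=TERMS):
    """Leading-wildcard match: every term in the dictionary is tested."""
    checked = 0
    hits = []
    for term in terms:
        checked += 1
        if fnmatchcase(term, pattern):
            hits.append(term)
    return hits, checked


def prefix_lookup(prefix, terms=TERMS):
    """Prefix match: binary-search to the first candidate, then read forward."""
    start = bisect_left(terms, prefix)
    checked = 0
    hits = []
    for term in terms[start:]:
        checked += 1
        if not term.startswith(prefix):
            break  # sorted order: no later term can match either
        hits.append(term)
    return hits, checked


# Each call returns (matching terms, number of terms inspected).
print(wildcard_scan("*tub*"))
print(prefix_lookup("you"))
```

With six terms the difference is trivial, but the scan grows with the number of distinct terms in the field, whereas the prefix lookup only touches the terms that can actually match.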


(Christoph Seiler) #5

Yep:

{
  "simple" : {
    "mappings" : {
      "_default_" : {
        "dynamic_templates" : [ {
          "de" : {
            "mapping" : {
              "analyzer" : "german",
              "type" : "string"
            },
            "match" : "*_de",
            "match_mapping_type" : "string"
          }
        }, {
          "en" : {
            "mapping" : {
              "analyzer" : "english",
              "type" : "string"
            },
            "match" : "*_en",
            "match_mapping_type" : "string"
          }
        }, {
          "it" : {
            "mapping" : {
              "analyzer" : "italian",
              "type" : "string"
            },
            "match" : "*_it",
            "match_mapping_type" : "string"
          }
        }, {
          "fr" : {
            "mapping" : {
              "analyzer" : "french",
              "type" : "string"
            },
            "match" : "*_fr",
            "match_mapping_type" : "string"
          }
        } ],
        "properties" : { }
      },
      "activity" : {
        "dynamic_templates" : [ {
          "de" : {
            "mapping" : {
              "analyzer" : "german",
              "type" : "string"
            },
            "match" : "*_de",
            "match_mapping_type" : "string"
          }
        }, {
          "en" : {
            "mapping" : {
              "analyzer" : "english",
              "type" : "string"
            },
            "match" : "*_en",
            "match_mapping_type" : "string"
          }
        }, {
          "it" : {
            "mapping" : {
              "analyzer" : "italian",
              "type" : "string"
            },
            "match" : "*_it",
            "match_mapping_type" : "string"
          }
        }, {
          "fr" : {
            "mapping" : {
              "analyzer" : "french",
              "type" : "string"
            },
            "match" : "*_fr",
            "match_mapping_type" : "string"
          }
        } ],
        "properties" : {
		  ...,
          "activityType" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "activityType_de" : {
            "type" : "string",
            "analyzer" : "german"
          },
          "activityType_en" : {
            "type" : "string",
            "analyzer" : "english"
          },
          "activityType_fr" : {
            "type" : "string",
            "analyzer" : "french"
          },
          "activityType_it" : {
            "type" : "string",
            "analyzer" : "italian"
          },
		  ...,
          "name" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}
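
One possible explanation, given the `german` analyzer on `activityType_de`: the analyzer stems tokens at index time, so "youtube" is probably stored as something like "youtub", while wildcard terms in a `query_string` query are not analyzed by default and are compared against the stored terms as typed. The Python sketch below is a toy model of that, not Elasticsearch code. `toy_stem` is a made-up stand-in that only strips a trailing "e"; the real stemmer does more, and the actual tokens should be checked with the `_analyze` API.

```python
from fnmatch import fnmatchcase


def toy_stem(token):
    """ASSUMPTION: stand-in for German stemming; only strips a trailing 'e'."""
    return token[:-1] if token.endswith("e") else token


def analyze(text):
    """Lowercase, split on whitespace, stem each token (toy analyzer)."""
    return [toy_stem(tok) for tok in text.lower().split()]


# Terms stored for the document's activityType_de value.
INDEXED_TERMS = analyze("youtube video")


def query_string(query):
    """Toy model: wildcard queries skip analysis, plain queries are analyzed."""
    if "*" in query or "?" in query:
        pattern = query.lower()  # lowercased, but NOT stemmed
        return any(fnmatchcase(term, pattern) for term in INDEXED_TERMS)
    return any(term in INDEXED_TERMS for term in analyze(query))


for q in ["youtube", "*youtube*", "tube", "*tube", "*tube*", "*tub*"]:
    print(q, "->", "hit" if query_string(q) else "no hit")
```

Under these assumptions the model reproduces the six observations from the first post: "youtube" is stemmed to "youtub" and matches, "*youtube*" and the "*tube" variants look for an "e" that was never indexed, plain "tube" becomes "tub", which is not a stored term on its own, and "*tub*" matches "youtub". If that is the cause, the `analyze_wildcard` option of `query_string` or a field without stemming may be worth looking at.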
