Analyzer not working as expected


(Ben) #1

I'm trying to run a query for the word "italy" against a document that contains the word "italian", but it returns no results. Searching for "italian" does return the document. I've set up an analyzer and confirmed that it stems "italian" to "italy". Please see my settings and query attempts below.

Thanks. Ben

    curl -u dev-user:foo -XGET 'https://foo.com:10724/myindex/_settings?pretty'
    {
      "myindex" : {
        "settings" : {
          "index" : {
            "creation_date" : "1444137978340",
            "analysis" : {
              "analyzer" : {
                "myanalyzer" : {
                  "type" : "custom",
                  "filter" : [ "standard", "lowercase", "stop", "kstem" ],
                  "tokenizer" : "standard"
                }
              }
            },
            "number_of_shards" : "5",
            "number_of_replicas" : "2",
            "version" : {
              "created" : "1050299"
            },
            "uuid" : "rawNPtrxSom_EVxVlw5X-g"
          }
        }
      }
    }
    
    curl -u dev-user:foo 'https://foo.com:10724/myindex/_mappings?pretty'
    {
      "myindex" : {
        "mappings" : {
          "story" : {
            "properties" : {
              "clips" : {
                "properties" : {
                  "description" : {
                    "type" : "string",
                    "analyzer" : "myanalyzer"
                  },
                  "title" : {
                    "type" : "string",
                    "analyzer" : "myanalyzer"
                  }
                }
              },
              "shortCode" : {
                "type" : "string"
              },
              "tags" : {
                "properties" : {
                  "_id" : {
                    "type" : "string"
                  },
                  "text" : {
                    "type" : "string",
                    "analyzer" : "myanalyzer"
                  }
                }
              },
              "title" : {
                "type" : "string",
                "analyzer" : "myanalyzer"
              }
            }
          }
        }
      }
    }
    
    curl -u dev-user:foo 'https://foo.com:10724/myindex/_analyze?analyzer=myanalyzer&text=italian&pretty'
    {
      "tokens" : [ {
        "token" : "italy",
        "start_offset" : 0,
        "end_offset" : 7,
        "type" : "<ALPHANUM>",
        "position" : 1
      } ]
    }
    
    curl -u dev-user:foo -XGET 'https://foo.com:10724/myindex/_search?pretty&q=italian'
    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 0.076713204,
        "hits" : [ {
          "_index" : "myindex",
          "_type" : "story",
          "_id" : "560ac78e92e1d3806bd2eb2b",
          "_score" : 0.076713204,
          "_source":{"shortCode":"E19Pj621l","title":"italian businesses","tags":[{"text":"fashion","_id":"560e9ebddee0f4f913b7a1a1"},{"text":"dolce","_id":"560e9ebddee0f4f913b7a19e"},{"text":"clothes","_id":"560e9ebddee0f4f913b7a19d"}],"clips":[{"title":"Dolce & Gabbana","description":"<p>lorem</p>"}]}
        } ]
      }
    }
    
    curl -u dev-user:foo -XGET 'https://foo.com:10724/myindex/_search?pretty&q=italy'
    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 0,
        "max_score" : null,
        "hits" : [ ]
      }
    }

(Doug Turnbull) #3

It looks like you've set everything up correctly. Did you reindex after you modified your analyzers/mappings?


(Ben) #4

Yes, I reindexed, but nothing changed. I have also tried additional documents, searching for "business" and "businesses", with the same sort of issue.


(Doug Turnbull) #5

Actually, I think the search is running against the _all field. I bet the _all field uses the standard analyzer, which does not stem, so _all would contain "italian" but never "italy". Try searching the title field instead.

Try ?q=title:italian and ?q=title:italy and let me know if both match.
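
For reference, the two field-scoped queries can be sketched as below. This is a minimal sketch, assuming the host, index, and credentials from the first post; `field_query_url` is a hypothetical helper introduced only for illustration. It prints the URLs rather than calling the server.

```shell
# Assumption: same host and index as in the first post.
BASE='https://foo.com:10724/myindex'

# Hypothetical helper: build a URI search URL scoped to one field.
# $1 = field name, $2 = search term
field_query_url() {
  printf '%s/_search?pretty&q=%s:%s\n' "$BASE" "$1" "$2"
}

# Print the two URLs to compare.
for term in italian italy; do
  field_query_url title "$term"
done

# To run one of them for real, pass the URL to curl:
#   curl -u dev-user:foo -XGET "$(field_query_url title italy)"
```

Because the title field is mapped with myanalyzer, both terms should be analyzed to the same token at query time, which is why the field-scoped form is expected to match where the bare ?q=italy did not.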


(Ben) #6

That worked - great, thanks. So how would I make this work on the _all field, or would I explicitly list all the fields to search against in the query?


(Ben) #7

I tried a multi_match query, which works. Is this the best way to do this, or is there an _all version?

    "multi_match" : {
      "query" : "italy",
      "fields" : [ "title", "description", "tags", "clips" ]
    }
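
As a complete request, the fragment above might look like the following. This is only a sketch: it assumes the host and credentials from the first post, and it lists the analyzed fields by their full paths from the mapping (`clips.title`, `clips.description`, `tags.text`), which is an assumption about intent rather than what was actually run. The script prints the body instead of sending it.

```shell
# Request body: multi_match across the fields mapped with myanalyzer.
# Assumption: full field paths taken from the _mappings output above.
BODY='{
  "query": {
    "multi_match": {
      "query": "italy",
      "fields": ["title", "clips.title", "clips.description", "tags.text"]
    }
  }
}'

# Show the body that would be sent.
echo "$BODY"

# To send it (host and credentials as in the first post):
#   curl -u dev-user:foo -XGET 'https://foo.com:10724/myindex/_search?pretty' -d "$BODY"
```

Listing the fields explicitly means each one is searched with its own mapped analyzer, so the query term is stemmed the same way the indexed text was.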
