Using two analyzers (stemmer and synonym) at the same time


(adish) #1

I have created an index with two analyzers, one for stemming and one for synonyms.
I have added a mapping for a field that uses both analyzers through multi-fields.

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_stemmer": {
            "tokenizer": "standard",
            "filter": [
              "standard",
              "lowercase",
              "custom_english_stemmer"
            ]
          },
          "my_synonym": {
            "tokenizer": "whitespace",
            "filter": [
              "synonym"
            ]
          }
        },
        "filter": {
          "custom_english_stemmer": {
            "type": "stemmer",
            "name": "english"
          },
          "synonym": {
            "type": "synonym",
            "format": "wordnet",
            "synonyms_path": "wn_s.pl"
          }
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "synonym":{
              "type":     "string",
              "analyzer": "my_synonym"
            },
            "my_stemmer":{
              "type":     "string",
              "analyzer": "my_stemmer"
            }
          }
        }
      }
    }
  }
}

I am using a multi_match query.
For example, suppose a document contains "belt" in its name and I run this query:

{
      "query": {
          "multi_match": {
              "query": "knock",
              "fields": [
                  "name.synonym^2",
                  "name.my_stemmer^2"
                 ]
          }
      }
  }

It returns the documents containing "belt", because "knock" is a synonym of "belt".
But if I search for "knocks", the results are empty. Can I apply both the stemmer and the synonym filter to a single field at the same time?


(David Kemp) #2

I think you may have a number of options.

For your particular scenario, you could search on the synonym field but force the query to use the stemmer:

{
  "query": {
    "match": {
      "name.synonym": {
        "query": "knocks",
        "analyzer": "my_stemmer"
      }
    }
  }
}
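
A minimal, untested sketch of the same idea applied to the original multi_match query, on the assumption that multi_match accepts an "analyzer" parameter in the same way match does:

```json
{
  "query": {
    "multi_match": {
      "query": "knocks",
      "fields": [
        "name.synonym^2",
        "name.my_stemmer^2"
      ],
      "analyzer": "my_stemmer"
    }
  }
}
```

Here "knocks" should be stemmed to "knock" before it is looked up in either sub-field, so it can match the synonym-expanded terms indexed in name.synonym.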

Alternatively, you could define and use an analyser that applies the synonym filter after the stemmer:

  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_combo": {
            "tokenizer": "standard",
            "filter": [
              "standard",
              "lowercase",
              "custom_english_stemmer",
              "synonym"
            ]
          },
....
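
A fuller version of that settings fragment might look like the sketch below. It assumes the elided part simply reuses the "custom_english_stemmer" and "synonym" filter definitions from the first post; that is a guess, not something stated above.

```json
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_combo": {
            "tokenizer": "standard",
            "filter": [
              "standard",
              "lowercase",
              "custom_english_stemmer",
              "synonym"
            ]
          }
        },
        "filter": {
          "custom_english_stemmer": {
            "type": "stemmer",
            "name": "english"
          },
          "synonym": {
            "type": "synonym",
            "format": "wordnet",
            "synonyms_path": "wn_s.pl"
          }
        }
      }
    }
  }
}
```

Because the synonym filter receives tokens that have already been stemmed, a synonym rule will most likely fire only when its input word equals the stemmed form, so it is worth checking a few terms with the _analyze API before relying on this ordering.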

Also be aware that, in the mapping for a field, you can declare a query analyzer that differs from the index analyzer. For example, you could declare a third multi-field whose index analyzer is the "analyzer_combo" I defined above but whose query analyzer is "my_stemmer":

    "name": {
      "type": "string",
      "fields": {
        "another_field": {
          "type": "string",
          "index_analyzer": "analyzer_combo",
          "search_analyzer": "my_stemmer"
        }
      }
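
Written out in full, that mapping could look like the sketch below. It assumes Elasticsearch 1.x, where the mapping parameters for the two roles are named "index_analyzer" and "search_analyzer", and it reuses the "_default_" type from the first post; "another_field" is only a placeholder name.

```json
{
  "mappings": {
    "_default_": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "another_field": {
              "type": "string",
              "index_analyzer": "analyzer_combo",
              "search_analyzer": "my_stemmer"
            }
          }
        }
      }
    }
  }
}
```

With a mapping like this, a plain match query on name.another_field should not need an explicit "analyzer" parameter, because the search-time analyzer comes from the mapping.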

Hope this helps.


(adish) #3

Hi,
Thanks for your valuable input. That really worked fine for me :smiley:
This worked for the simple query:

{
  "query": {
    "match": {
      "name.synonym": {
        "query": "knocks",
        "analyzer": "my_stemmer"
      }
    }
  }
}

But what if the query is more complex and uses filtering? I tried different approaches but was unable to specify an analyzer in it.

{
 "min_score": 0.5,
 "query": {
   "filtered": {
     "filter": {
       "and": {
         "filters": [
           {
             "terms": {
               "tags.combo": [
                 "belt"
               ]
             }
           }
         ]
       }
     },
     "query": {
       "match_all": {}
     }
   }
 }
}

where combo is a multi-field defined on tags:

"tags": {
  "type": "string",
  "fields": {
    "combo": {
      "index_analyzer": "analyzer_combo",
      "type": "string"
    }
  }
}

Is it possible to specify an analyzer explicitly in this query?
Thanks in advance.

