How to do index-time synonyms


(Chung) #1

I am trying to define a synonym for a category so that "voucher" and "gift card" mean the same thing at index time. That way, when users type in either voucher or gift card, Elasticsearch returns the same results.

For some reason, the settings and mappings I am using don't work at all.

What did I miss in the configs?

settings

{
  "number_of_replicas": 0,
  "analysis": {
    "filter": {
      "category_synonym": {
        "type": "synonym",
        "tokenizer": "keyword",
        "synonyms": [
          "voucher, gift card"
        ]
      }
    },
    "analyzer": {
      "keyword_analyzer": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": [
          "lowercase",
          "category_synonym"
        ]
      }
    }
  }
}

mapping

{
  "properties": {
    "category": {
      "type": "keyword",
      "analyzer": "keyword_analyzer"
    }
  }
}

#2

What is the result of:

GET /index/_analyze
{
  "analyzer" : "keyword_analyzer",
  "text" : "voucher"
}

(Chung) #3

I got

{
  "tokens": [
    {
      "token": "voucher",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 0
    },
    {
      "token": "gift card",
      "start_offset": 0,
      "end_offset": 7,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}

#4

As you can see, if your document contains "voucher", it will be indexed as both "voucher" and "gift card", as you defined in your mapping, so the analyzer itself looks fine to me.
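
The request in #2 names the analyzer directly, so it shows what keyword_analyzer produces but not whether the category field actually uses it. The _analyze API also accepts a field parameter in place of analyzer, which runs the text through whatever analysis the mapping applies to that field. A minimal sketch, assuming the index is called index as in #2 and the field is category as in the mapping from the first post:

GET /index/_analyze
{
  "field" : "category",
  "text" : "voucher"
}

If this returns only the "voucher" token, the field is not using keyword_analyzer at index time, and the mapping is the place to look.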


(Chung) #5

I am not sure why it didn't work. Do I have to add something special to the query?

GET myindex/_search 
{
    "query": {
        "bool" : {
            "must" : {
                "query_string" : {
                    "query" : "gift card"
                }
            }
        }
    }
}
returned correct results
GET myindex/_search
{
    "query": {
        "bool" : {
            "must" : {
                "query_string" : {
                    "query" : "voucher"
                }
            }
        }
    }
}
returned nothing (the same happens with "Voucher")

#6

No, the query looks fine, so you need to debug this and work out why the second query doesn't return anything.

Add the explain parameter to see how the score is calculated, and search only on the category field to make sure the results of the first query are actually coming from that field.

POST index/_search
{
  "explain": true,
  "_source": [
    "category"
  ],
  "query": {
    "query_string": {
      "fields": [
        "category"
      ],
      "query": "gift card"
    }
  }
}

POST index/_search
{
  "explain": true,
  "_source": [
    "category"
  ],
  "query": {
    "query_string": {
      "fields": [
        "category"
      ],
      "query": "voucher"
    }
  }
}
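
If the explain output shows that the hits for "gift card" come from a field other than category, it may also help to check how category is actually mapped:

GET index/_mapping

One thing to look for, offered here as an assumption since the thread does not confirm it: keyword fields take a normalizer rather than an analyzer, so the analyzer setting in the mapping from the first post may never have been applied to category. A sketch of a mapping that does accept a custom analyzer, using the text type and the same names as the first post:

{
  "properties": {
    "category": {
      "type": "text",
      "analyzer": "keyword_analyzer"
    }
  }
}

The type of an existing field cannot be changed in place, so trying this means creating a new index with the updated mapping and reindexing the documents.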
