Is it possible to set phrase queries on certain multi-term synonyms?

Hi,

I have the following synonyms mapping on the query side:

ckd, chronic kidney disease

On the index side I have the following:

chronic kidney disease => ckd, chronic kidney disease

  1. If someone searches for 'ckd', it should only search for 'ckd' and the phrase 'chronic kidney disease'.

  2. If someone searches for 'chronic kidney disease', it should search for 'ckd' and for chronic kidney disease both as a phrase and as conjoined terms, so +chronic +kidney +disease anywhere in the document.

Everything works great except point 1. To get it to work I have to set auto_generate_synonyms_phrase_query to true, but when I do this I lose the behavior in point 2: docs where the terms chronic, kidney and disease appear separately are no longer returned.

Is there any way around this?

Thanks

You are running into an interesting challenge with multi-word synonyms: if a multi-word query matches a phrase that has been defined as a synonym, a match query treats it as a phrase. This breaks the usual behavior of match queries, where the search terms are combined in a simple OR query.

The easiest way to work around this is to use query-time synonyms instead of index-time synonyms (preferably using the synonym_graph token filter). This lets you rewrite your query as a bool query that searches for the terms with synonyms and without synonyms at the same time. As a result, you will find documents containing the synonyms as well as documents that contain the individual search terms.

So, given an index with the following settings and mappings:

PUT ckd
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym_graph",
          "synonyms": [
            "ckd, chronic kidney disease"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonyms"
          ]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "my_field": {
          "type": "text",
          "analyzer": "standard",
          "search_analyzer": "my_analyzer"
        }
      }
    }
  }
}

The following queries should return the desired results:

GET ckd/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_field": {
              "query": "ckd"
            }
          }
        },
        {
          "match": {
            "my_field": {
              "query": "ckd",
              "analyzer": "standard",
              "operator": "and"
            }
          }
        }
      ]
    }
  }
}

GET ckd/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_field": {
              "query": "chronic kidney disease"
            }
          }
        },
        {
          "match": {
            "my_field": {
              "query": "chronic kidney disease",
              "analyzer": "standard",
              "operator": "and"
            }
          }
        }
      ]
    }
  }
}
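
Both queries above have the same shape, differing only in the query text, so the request body can be generated rather than written by hand. Here is a minimal Python sketch, assuming the field and analyzer names from the example index. The helper name build_synonym_query is hypothetical and not part of any Elasticsearch client; it only builds the request body and does not send it.

```python
import json


def build_synonym_query(field, text, plain_analyzer="standard"):
    """Build the two-clause bool query shown above.

    Hypothetical helper for illustration; it returns a plain dict
    and does not talk to Elasticsearch.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Clause 1: no analyzer override, so the field's
                    # search_analyzer (with synonyms) is applied.
                    {"match": {field: {"query": text}}},
                    # Clause 2: analyzer without synonyms, and every
                    # term must be present somewhere in the field.
                    {
                        "match": {
                            field: {
                                "query": text,
                                "analyzer": plain_analyzer,
                                "operator": "and",
                            }
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_synonym_query("my_field", "chronic kidney disease")
    print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET ckd/_search, or the dict can be handed to whichever client you use.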

Thanks @abdon I shall give this a go!
