Synonyms as an option at query time

My process is to create an index offline and then move it to the production environment. Synonyms are easy enough to implement, but the risk is that they produce a lot of extra hits. I want to offer users the option to search with or without synonyms. The first problems I've come across are:

  • A mapping cannot contain multiple document types
  • The analyzed field is the same in multiple mappings

I obviously do not want to create two indexes for the cases with and without synonyms.

Is there a best practice or a success story for making synonyms optional at search time?

Hi @karejonsson,

You could create two different analyzers (with and without synonyms) and use multi-fields in the mapping:

PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym": {
          "type": "synonym",
          "synonyms": [
            "universe, cosmos"
          ]
        }
      },
      "analyzer": {
        "analyzer_with_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "synonym"
          ]
        },
        "analyzer_without_synonyms": {
          "tokenizer": "standard",
          "filter": []
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "field1": {
          "type": "text",
          "analyzer": "analyzer_without_synonyms",
          "fields": {
            "synonym": {
              "type": "text",
              "analyzer": "analyzer_with_synonyms"
            }
          }
        }
      }
    }
  }
}

When you search in field1 it won't use synonyms:

POST my_index/doc/1
{
  "field1": "universe"
}

GET my_index/_search
{
  "query": {
    "match": {
      "field1": "cosmos"
    }
  }
}

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

And if you search in field1.synonym it will use synonyms:

GET my_index/_search
{
  "query": {
    "match": {
      "field1.synonym": "cosmos"
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.46029136,
    "hits": [
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.46029136,
        "_source": {
          "field1": "universe"
        }
      }
    ]
  }
}
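
To offer this as a user option, the application only has to pick the field name when it builds the query. A minimal sketch in Python, assuming the mapping above; the function name build_match_query and its parameters are made up for illustration, and the returned dict is just the request body you would send to my_index/_search with whatever client you use:

```python
import json


def build_match_query(text, use_synonyms, field="field1"):
    """Build a match query body for the plain field or its synonym sub-field.

    Hypothetical helper: "field1" and the ".synonym" sub-field come from the
    mapping shown earlier; nothing here talks to Elasticsearch itself.
    """
    # The sub-field is analyzed with analyzer_with_synonyms,
    # the parent field with analyzer_without_synonyms.
    target = f"{field}.synonym" if use_synonyms else field
    return {"query": {"match": {target: text}}}


if __name__ == "__main__":
    # Show the two request bodies side by side.
    print(json.dumps(build_match_query("cosmos", use_synonyms=False), indent=2))
    print(json.dumps(build_match_query("cosmos", use_synonyms=True), indent=2))
```

With use_synonyms=False the body matches the first search above, and with use_synonyms=True it matches the second, so one index serves both cases.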

Hope it helps.

Cheers,
Luiz Santos

Thanks, Luiz.
I implemented it following your details and I am very satisfied with it. The solution I had found on my own used duplicated fields with duplicated analysis, to fit what I knew about Elasticsearch. This is so much better.
Kind regards,
Kåre Jonsson
