Generating the same token for related words

Hello everybody.

I would like to know whether an analyzer can generate the same token for the following words: "bronzeadora", "bronze", "bronzeado". The token I need for all three words is "bronz".

I tried the stemmer token filter with the "brazilian" language, but it did not produce a common token, as you can see:

GET _analyze
{
  "text": [
    "bronzeadora",
    "bronze",
    "bronzeado"
  ],
  "tokenizer": "standard",
  "filter": [
    {
      "type": "stemmer",
      "language": "brazilian"
    }
  ]
}

Response:

{
  "tokens": [
    {
      "token": "bronzeador",
      "start_offset": 0,
      "end_offset": 11,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "bronz",
      "start_offset": 12,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 101
    },
    {
      "token": "bronze",
      "start_offset": 19,
      "end_offset": 28,
      "type": "<ALPHANUM>",
      "position": 202
    }
  ]
}

I know I can solve the problem with synonyms, but I wanted to make sure there isn't some other filter that can do this.
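
For reference, the synonym workaround I have in mind would look roughly like the request below. It is only a sketch: the inline synonym filter and its single mapping rule are my own assumption, and I have not checked how it behaves once it is defined in the index settings rather than passed to _analyze.

```
# Sketch only: the mapping rule below is an assumption, not a tested configuration
GET _analyze
{
  "text": [
    "bronzeadora",
    "bronze",
    "bronzeado"
  ],
  "tokenizer": "standard",
  "filter": [
    {
      "type": "synonym",
      "synonyms": [
        "bronzeadora, bronze, bronzeado => bronz"
      ]
    }
  ]
}
```

The drawback is that every word family has to be listed by hand, which is why I am hoping a filter can handle this instead.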
