Having trouble with synonyms and match query

I'm trying to search for the following products using a match query:

1 - Suco SUCO E SÓ Tangerina Integral;
2 - Suco SUCO E SÓ Laranja Integral;
3 - Suco SUCO E SÓ Uva Integral 300ml

The query:

"query": {
  "bool": {
    "must": [
      {
        "match": {
          "name.synonyms": {
            "query": "suco laranja",
            "operator": "and"
          }
        }
      }
    ]
  }
}

The problem: the query above returns products 1 and 3, while I expected it to return only product 2. Since the match query operator is "and", it should search for products that contain ('suco' or any matching synonym of 'suco') AND ('laranja' or any matching synonym of 'laranja'). Isn't that right?

I want to know why the matches include products that don't have the word 'laranja' in their names.

My settings:

{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_br": {
          "type": "synonym",
          "synonyms": [
            "suco => refresco, bebida de soja"
          ]
        },
        "brazilian_stop": {
          "type": "stop",
          "stopwords": "_brazilian_"
        }
      },
      "analyzer": {
        "synonyms": {
          "filter": [
            "lowercase",
            "synonym_br",
            "brazilian_stop",
            "asciifolding"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}

Without seeing the documents' content and the mapping for this index (what does name.synonyms store?), it is hard to guess why those documents are returned. You can get an idea of how the documents in question (1 and 3) get a positive score at all by running the query against them with the explain API. In your case this would be something like:

GET /yourIndexNameHere/_explain/1 <- put the document id last here
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name.synonyms": {
              "query": "suco laranja",
              "operator": "and"
            }
          }
        }
      ]
    }
  }
}

The output can be a bit verbose, but fortunately you don't need to follow the exact score calculation. You only need to spot clauses that match your document where you don't expect them to. I'd start further debugging from there.
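
Two other requests may help narrow this down. Treat them as a sketch: yourIndexNameHere is the same placeholder as above, and the second request assumes that name.synonyms is analyzed with the "synonyms" analyzer from your settings, which only the mapping can confirm. The first request shows the mapping; the second shows the tokens and positions that analyzer produces for your query text:

GET /yourIndexNameHere/_mapping

GET /yourIndexNameHere/_analyze
{
  "analyzer": "synonyms",
  "text": "suco laranja"
}

If the token list returned by _analyze is not what you expect for 'suco' and 'laranja', the synonym rule would be a reasonable place to look next.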
