Elasticsearch synonyms and boost by category


(Juanjo Aguilella) #1

I have an issue with Elasticsearch 5.0 and have already spent a lot of time on it; maybe you can help me.

I have an index with a lot of products, and I boost some categories with this search:

{
  "from": 1,
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "match": {
          "NAME": {
            "query": "tv",
            "operator": "and"
          }
        }
      },
      "should": {
        "query_string": {
          "query": "category: 23",
          "boost": 2
        }
      }
    }
  }
}

This works fine, and the scores of the results in this category change as expected.

Now we want to introduce synonyms for the search, so we defined them in a custom analyzer in Elasticsearch like this:

"analysis": {
    "analyzer": {
         "synonym": {
              "tokenizer": "whitespace",
              "filter": ["lowercase", "asciifolding", "synonym_filter"]
         }
    },
    "filter": {
        "synonym_filter": {
            "type": "synonym",
            "language": "spanish",
            "synonyms": [
                "tv, television, tdt"
             ]
        }
    }
}

We changed the query to use the new analyzer, and this works fine:

{
  "from": 1,
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "match": {
          "NAME": {
            "query": "television",
            "operator": "and",
            "analyzer": "synonym"
          }
        }
      }
    }
  }
}

But when we try to apply the boost to this query, the results don't change:

{
  "from": 1,
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "match": {
          "NAME": {
            "query": "television",
            "operator": "and",
            "analyzer": "synonym"
          }
        }
      },
      "should": {
        "query_string": {
          "query": "category: 23",
          "boost": 2
        }
      }
    }
  }
}

Can someone help me?

Regards.


(Martijn Van Groningen) #2

Does the ordering of the hits not change, or do the scores of the hits not change?
The boost should have an effect on the score, though perhaps a small one. You can see it if you enable query explanation.
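
As a rough illustration of enabling query explanation, here is a minimal sketch that builds the search body as a Python dict and prints it as JSON. The Python wrapper is only for illustration; the field names, analyzer name, and values are copied from the first post, and `explain` is the top-level search option that adds a score breakdown to each hit.

```python
import json

# Search body with query explanation enabled.
# Field names and values are taken from the original post; only the
# "explain" flag is new here.
search_body = {
    "explain": True,  # each hit then carries an "_explanation" score breakdown
    "query": {
        "bool": {
            "must": {
                "match": {
                    "NAME": {
                        "query": "television",
                        "operator": "and",
                        "analyzer": "synonym",
                    }
                }
            },
            "should": {
                "query_string": {"query": "category: 23", "boost": 2}
            },
        }
    },
}

# Serialize to the JSON that would be sent as the request body.
print(json.dumps(search_body, indent=2))
```

Comparing the `_explanation` of a hit inside category 23 with one outside it should show how much the should clause contributes relative to the match on NAME.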

I think that because you have enabled synonyms at query time, the weight of the match query has become more important than the should clause on category (hits can have up to 3 matches there, for tv, television and tdt). It seems that a match on category is more important than a match on the name for your ranking, so maybe you can boost down the match query by adding a boost of 0.5 (or smaller; you need to play around with it) and remove the boost on category.

Also, for the query on category you can just use a term query instead of a query_string query, as the query parsing that query_string does is not needed in your case.
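
Put together, the two suggestions could look roughly like the sketch below, again written as a Python dict that serializes to the request body. `build_query` is a hypothetical helper, not part of any client library; 0.5 is only a starting value to tune, and the term query assumes `category` is indexed so that the exact value 23 matches (for example as a numeric or keyword field).

```python
import json


def build_query(text, category, name_boost=0.5):
    """Hypothetical helper: down-weight the NAME match, plain term on category."""
    return {
        "query": {
            "bool": {
                "must": {
                    "match": {
                        "NAME": {
                            "query": text,
                            "operator": "and",
                            "analyzer": "synonym",
                            # Assumed starting point; tune by inspecting scores.
                            "boost": name_boost,
                        }
                    }
                },
                # Term query replaces query_string; no boost on category.
                "should": {"term": {"category": category}},
            }
        }
    }


body = build_query("television", 23)
print(json.dumps(body, indent=2))
```

Lowering `name_boost` further shrinks the contribution of the synonym matches, so the category match weighs more in the final score.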


(David Pilato) #3

BTW, just a note: from: 1 means that you are skipping the first result, which is the most relevant one. I guess you don't want to do this.


(Juanjo Aguilella) #4

@dadoonet thanks for your note. I copied the source code and replaced the variables with values to make the example more readable. :wink:


(Juanjo Aguilella) #5

@mvg thanks for your comments. It seems to work as you say: the values that we don't have in the synonyms don't get the same scoring as the ones that we do have. I need to add all the synonym values (accents included) to make it work correctly. With this modification the result is more than satisfactory.

Regards and many thanks :wink:


(JavaES) #6

Hi Juanjo Aguilella / David Pilato / Martijn,
I'm also facing a related issue with the synonym analyzer. Could you please help me by looking at my post? Thanks in advance for your help.

