Giving a higher score to (boosting) documents carrying a certain field value


#1

I am fairly new to Elasticsearch and still getting my feet wet, so please pardon me if I missed any documentation or misunderstood the usage.

Now, on to the problem. The following is a simple default mapping I am using to index documents (products) with their associated brand name.

$params = [
'index' => 'test_index',
'body' => [
  'mappings' => [
	'_default_' => [
	  'properties' => [
		'brand' => [
		  'properties' => [
			'name' => [
			  'type' => 'text',
			  'fields' => [
				'keyword' => [
				  'type' => 'keyword',
				  'ignore_above' => 256
				]
			  ]
			],
			'private' => [
			  'type' => 'byte',
			  'fields' =>[
				'keyword' => [
				  'type' => 'keyword',
				  'ignore_above' => 256
				]
			  ]
			]
		  ]
		],
		'product' => [
		  'properties' => [
			'id' => [
			  // Numeric fields are indexed by default and never analyzed,
			  // so no 'index' setting is needed here.
			  'type' => 'integer'
			],
			'name' => [
			  'type' => 'text',
			  'fields' => [
				'keyword' => [
				  'type' => 'keyword',
				  'ignore_above' => 256
				]
			  ]
			]
		  ]
		]
	  ]
	]
  ]
]
];

The private field has two possible values: 1 and 0.

As part of the search query, I am trying to rank private brands (brand.private = 1) ahead of non-private brands (brand.private = 0).

Using the query below,

'{
    "index": "test_index",
    "type": "test_type",
    "explain": true,
    "body": {
        "from": 0,
        "size": 20,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": "some_query",
                            "type": "phrase",
                            "fields": [
                                "product.name^4",
                                "brand.name^3"
                            ]
                        }
                    }
                ],
                "should": {
                    "term": {
                        "brand.private": {
                            "value": 1,
                            "boost": 5
                        }
                    }
                }
            }
        }
    }
}'

I am able to boost the score, but some non-private brands still show up ahead of private brands, and rightly so, since their overall score is still higher. This makes sense, but I need to show private brands first and non-private brands after them. Is there a way to accomplish this?

Another possible alternative I came across is the function_score query, but I am not sure if it's the right way to go.


(Ivan Brusic) #2

If you need to show all private brands first, then I would sort on the
private field, with the score as a secondary sort. Make sure
track_scores is enabled (it is off by default when sorting on a field)
if you want the score returned with each hit. All your private
brands will appear first, sorted by score, followed by the non-private
brands, also sorted by score.
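
To make the resulting order concrete, here is a small plain-PHP sketch of the same two-level ordering applied to a handful of made-up hits. It does not talk to Elasticsearch; the hit names, the private flags, and the scores are all hypothetical and only illustrate "private first, then by score".

```php
<?php
// Hypothetical hits; names, flags, and scores are invented for illustration.
$hits = [
    ['name' => 'A', 'private' => 0, 'score' => 9.1],
    ['name' => 'B', 'private' => 1, 'score' => 2.3],
    ['name' => 'C', 'private' => 1, 'score' => 4.7],
    ['name' => 'D', 'private' => 0, 'score' => 3.0],
];

// Primary key: private, descending. Secondary key: score, descending.
// Comparing [private, score] pairs with <=> compares element by element.
usort($hits, function (array $a, array $b): int {
    return [$b['private'], $b['score']] <=> [$a['private'], $a['score']];
});

$order = array_column($hits, 'name');
echo implode(', ', $order), PHP_EOL; // C, B, A, D
```

Note that hit A has the highest score overall but still lands behind both private hits, which is the behavior asked for in the question.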

I prefer using function scores over boosting via boolean queries since the
boost level is consistent. Boosting in a boolean query can turn out not to
be linear due to the way the Lucene scoring model works. With function
scores, you know a boost of 5 will boost the overall score by 5. With the
new painless scripting language, scripts should be cached and therefore the
performance penalty should be negligible.
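
As a rough sketch of what the function score variant of the original query could look like, the body below wraps the same multi_match in a function_score query and applies a weight of 5 to documents whose brand.private is 1. The variable name $body is an assumption for illustration, and the weight simply mirrors the boost value used in the question. This scales the score of private brands; it does not by itself guarantee that they come first.

```php
<?php
// Sketch of a function_score request body (not tested against a cluster).
$body = [
    'query' => [
        'function_score' => [
            // Same text query as in the original bool query.
            'query' => [
                'multi_match' => [
                    'query'  => 'some_query',
                    'type'   => 'phrase',
                    'fields' => ['product.name^4', 'brand.name^3'],
                ],
            ],
            // Apply a weight only to documents matching the filter.
            'functions' => [
                [
                    'filter' => ['term' => ['brand.private' => 1]],
                    'weight' => 5,
                ],
            ],
            // Multiply the query score by the function result (the default mode).
            'boost_mode' => 'multiply',
        ],
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

The array would go under the 'body' key of the search parameters, in the same way as the mapping and queries shown earlier in the thread.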

Cheers,

Ivan


#3

Oh, yeah. It never crossed my mind; I was busy navigating the more complex solutions. Anyhow, I tried sorting with the following query:

'{
    "index": "test_index",
    "type": "test_type",
    "body": {
        "from": 0,
        "size": 20,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": "some_query",
                            "type": "phrase",
                            "fields": [
                                "product.name^4",
                                "brand.name^3"
                            ]
                        }
                    }
                ]
            }
        },
        "track_scores": true,
        "sort": [
            {
                "brand.private": {
                    "order": "desc"
                }
            },
            {
                "_score": {
                    "order": "desc"
                }
            }
        ]
    }
}'

I am getting the expected results. However, track_scores seems to have no impact at all, except that without it the score field shows null. Is this the expected behavior of track_scores? Additionally, for the other approach you shared, i.e. using function_score, is it worth the effort compared to the current solution?

Thank you for sharing a solution!


(Ivan Brusic) #4

That is the purpose of track_scores. Lucene will not score documents if
the score is not used as the primary sort.

The sort method is the right way to go for this use case. I added the side
note about function score queries since you mentioned using boosts in
boolean queries. I find function scores incredibly useful; they are an
excellent way to influence behavior.


#5

Perfect! Appreciate the help :)

