Boosting specific documents to be at the top of the results

Hi there,

I don't know how to best describe the topic, excuse the vague title.

I'm having an issue with the search results coming from an index.

The index contains 1M+ documents (various products), but when I run a text search I get odd matches.

Of those 1M+ products, there are about:

1000-1100 smartphone products,
3200 smartphone LCD spare-parts,
10k other smartphone related accessories.

It seems that because the smartphone category contains fewer items than the other 2, I get mixed results. Is there a way to boost the smartphone products so they come up a lot higher in the results, instead of being mixed in with the rest?

For example, searching for "xiaomi redmi note 8",
I get an LCD replacement screen as the 1st result, then 2-3 actual phones, then some more LCD and battery parts. The scores are quite similar, and it doesn't matter which phone you're looking for, because each phone "title" is included in those other categories multiple times (OEM LCD for "model title", Original LCD for "model title", Protective Case for "model title", Car mount for "model title" etc.)

Please let me know what kind of information you need from me.

Running ES7.8 on Ubuntu 20.04

This was a related subject which may be of interest

Thanks for replying.

I checked the 2 links provided there about pinned and boosting queries. They seem unrealistic for my situation, where my index currently contains 2.4m documents/products and my target is 6-7m. I'm also a 1-man army and my app has 0 income. I should also edit the initial post to include other restrictions, like the fact that the app I'm trying to integrate ES into is a Laravel Framework app (Scout/ScoutElastic).

The "problematic" results, though, occur for products that can have accessories, consumables, and/or repair parts, especially if the "primary" product category contains fewer items than the accessories for it, e.g. Smartphones (1k products) vs. Smartphone Accessories (150k products). It is logical, I guess, that since the search terms appear more often in accessories, the engine will score those higher.

Other examples include the following...
Printers for example, have a lot of consumables.
Laptops have repair parts and accessories.
Phones have a $-ton more accessories and more repair parts than actual phone products.

Generally, any category of products you can think of that has accessories/repair-parts.

For down-boosting accessories/consumables this comment may be relevant.
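As a rough illustration of down-boosting, here is a minimal sketch that builds an Elasticsearch `boosting` query body as a plain Python dict. It assumes a text field named `title` and a keyword field named `category`; both field names, the category values, and the helper name are hypothetical and not taken from this thread. The demotion is applied at query time, so it does not depend on the number of documents in the index.

```python
import json

def downboost_accessories_query(user_text, accessory_categories, negative_boost=0.2):
    """Build a `boosting` query: match the title, demote accessory categories.

    `title` and `category` are assumed field names; `category` is assumed
    to be mapped as a keyword field so that `terms` matches exact values.
    """
    return {
        "query": {
            "boosting": {
                # Documents must match this clause to be returned at all.
                "positive": {"match": {"title": user_text}},
                # Documents that also match this clause stay in the results,
                # but their score is multiplied by negative_boost.
                "negative": {"terms": {"category": list(accessory_categories)}},
                "negative_boost": negative_boost,
            }
        }
    }

# Hypothetical category values, for illustration only.
body = downboost_accessories_query(
    "xiaomi redmi note 8",
    ["smartphone-lcd-parts", "smartphone-accessories"],
)
print(json.dumps(body, indent=2))
```

The printed JSON is the request body for `_search`; a `negative_boost` closer to 0 pushes the accessories further down without removing them.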

"You can machine-learn the set of trigger keywords ..."

This escalated quickly to machine-learning... :rofl:

I wish I had the $$$ to hire an expert, but I'm just a PHP/Laravel app dev that is missing a great search experience on my app.

:slight_smile:

There's no worldly knowledge that ships with Elasticsearch that makes it understand that "iphone 64gb" is a better match than "iphone case" for an "iphone" search. You need to use the APIs we offer to add that sort of understanding.

Learning words associated with cases/covers etc isn't too hard. It's a single request. Here's an example using the BestBuy query logs.

GET bestbuykaggle/_search
{
  "query": {
    "match": {
      "query": "case"
    }
  },
  "size": 0,
  "aggs": {
    "sample": {
      "sampler": {
        "shard_size": 10000
      },
      "aggs": {
        "words associated with cases": {
          "significant_text": {
            "field": "query",
            "size": 200
          }
        }
      }
    }
  }
}
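For readers who want to consume the result programmatically, the sketch below pulls the terms out of the response to a request shaped like the one above. The terms sit in the `buckets` of the `significant_text` aggregation, nested under the `sampler` aggregation. The helper name and the sample response are made up for illustration; real bucket contents depend on your data.

```python
def significant_words(response, sampler_name="sample",
                      agg_name="words associated with cases"):
    """Return the significant terms, in the order Elasticsearch ranked them."""
    buckets = response["aggregations"][sampler_name][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]

# Made-up response fragment: only the shape matters, not the numbers.
example_response = {
    "aggregations": {
        "sample": {
            "doc_count": 10000,
            "words associated with cases": {
                "doc_count": 10000,
                "bg_count": 1000000,
                "buckets": [
                    {"key": "iphone", "doc_count": 900, "score": 1.5, "bg_count": 5000},
                    {"key": "ipad", "doc_count": 400, "score": 0.9, "bg_count": 3000},
                ],
            },
        }
    }
}

words = significant_words(example_response)
print(words)
```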

You can do this on your own data (query logs or product descriptions will do).
This will give you a list of words like "iphone" to look for in every incoming user search string. When a search string contains one of those words but doesn't contain "case", down-boost the documents that match "case".
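The rule described above can be sketched in application code as follows. This is only one possible reading of it: the `title` field name, the helper name, the word sets, and the `negative_boost` value are all assumptions for illustration, and the query is built as a plain dict rather than through any particular client library.

```python
import re

def build_search_body(user_text, trigger_words, accessory_words, negative_boost=0.2):
    """Down-boost accessory matches only when the user did not ask for one.

    trigger_words:   learned product words such as "iphone".
    accessory_words: words such as "case" or "cover".
    `title` is an assumed text field name.
    """
    trigger_words = {w.lower() for w in trigger_words}
    accessory_words = {w.lower() for w in accessory_words}
    tokens = set(re.findall(r"\w+", user_text.lower()))

    positive = {"match": {"title": user_text}}
    wants_accessory = bool(tokens & accessory_words)
    mentions_product = bool(tokens & trigger_words)

    # Leave the query untouched if the user typed "case" (or similar),
    # or if no known product word appears in the search string.
    if wants_accessory or not mentions_product:
        return {"query": positive}

    # Otherwise demote documents whose title contains any accessory word.
    # A multi-term `match` uses OR by default, so one word is enough.
    negative = {"match": {"title": " ".join(sorted(accessory_words))}}
    return {
        "query": {
            "boosting": {
                "positive": positive,
                "negative": negative,
                "negative_boost": negative_boost,
            }
        }
    }

phone_search = build_search_body("iphone 64gb", {"iphone"}, {"case", "cover"})
case_search = build_search_body("iphone case", {"iphone"}, {"case", "cover"})
```

Here `phone_search` carries the `boosting` wrapper, while `case_search` is a plain `match`, because the user explicitly asked for a case.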

I get it... while also not getting it.. :rofl:
