I am trying to build an e-commerce-like search engine and I'd like to improve my queries and indexing to handle the following scenario.
Sample index using the minimal_english stemmer (filter_english_minimal), intended to match dress with dresses but not dresser:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "filter_english_minimal"
          ]
        }
      },
      "filter": {
        "filter_english_minimal": {
          "type": "stemmer",
          "name": "minimal_english"
        }
      }
    }
  }
}
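A quick way to check which tokens the analyzer above actually produces is the _analyze API. The body below is a minimal sketch, assuming the settings are applied to an index named test and the request is sent as GET /test/_analyze; the three sample words are only illustrative input.

```json
{
  "analyzer": "my_analyzer",
  "text": "dress dresses dresser"
}
```

Comparing the returned tokens shows whether dress and dresses really end up as the same term at index time, which is worth confirming before tuning the query.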
Sample documents:
[{
  "name": "Dress shirt red",
  "categories": ["clothing", "dress shirt"]
}, {
  "name": "Mini dress black",
  "categories": ["clothing", "dresses"]
}, {
  "name": "Chelsea boots",
  "categories": ["clothing", "dress boots"]
}, {
  "name": "Floral dress",
  "categories": ["clothing", "dress"]
}, {
  "name": "Midi white dress",
  "categories": ["clothing", "midi dress"]
}]
My current query is:
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "multi_match": {
                  "query": "dress",
                  "minimum_should_match": "100%",
                  "type": "cross_fields"
                }
              },
              {
                "multi_match": {
                  "query": "dress",
                  "minimum_should_match": "100%",
                  "type": "best_fields",
                  "fuzziness": "AUTO",
                  "prefix_length": 2
                }
              }
            ]
          }
        }
      ],
      "should": [
        {
          "multi_match": {
            "query": "dress",
            "type": "phrase",
            "boost": 6,
            "slop": 5
          }
        }
      ]
    }
  }
}
which returns the following:
{
  "took": 602,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 15.492477,
    "hits": [
      {
        "_index": "test",
        "_id": "y4yCMX8BIpIN3Vwrn8m_",
        "_score": 15.492477,
        "_source": {
          "name": "Floral dress",
          "categories": [
            "clothing",
            "dress"
          ]
        }
      },
      {
        "_index": "test",
        "_id": "yIyBMX8BIpIN3Vwrocm-",
        "_score": 3.081841,
        "_source": {
          "name": "Dress shirt red",
          "categories": [
            "clothing",
            "dress shirt"
          ]
        }
      },
      {
        "_index": "test",
        "_id": "zIyCMX8BIpIN3Vwr_sm3",
        "_score": 3.081841,
        "_source": {
          "name": "Midi white dress",
          "categories": [
            "clothing",
            "midi dress"
          ]
        }
      },
      {
        "_index": "test",
        "_id": "yoyCMX8BIpIN3VwrRclb",
        "_score": 2.9274213,
        "_source": {
          "name": "Chelsea boots",
          "categories": [
            "clothing",
            "dress boots"
          ]
        }
      },
      {
        "_index": "test",
        "_id": "yYyBMX8BIpIN3Vwr6cme",
        "_score": 1.7833835,
        "_source": {
          "name": "Mini dress black",
          "categories": [
            "clothing",
            "dresses"
          ]
        }
      }
    ]
  }
}
Is there any way to make the dresses category rank higher than dress boots or dress shirt?