I have data for Products, Orders, and Purchase Orders, each stored in its own relational database.
Each type of data (Products, Orders, etc.) has about 4-5 fields on which an end user would search, and the search has to span Products, Orders, and Purchase Orders.
We have a global search feature (a search box) that must return results relevant to the search term provided by the user.
The user may enter the search term in uppercase or lowercase, and the search engine must handle both, i.e. matching must be case-insensitive.
The search engine must also return results for a partial search term. That is, the user should be able to enter part of an orderId, productName, or PO number and still get hits.
To address this use case, I have created 3 indices: one each for Products, Orders, and Purchase Orders.
The mapping for each index looks similar to this:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercasespaceanalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "purchaseOrderNumber": {
        "type": "text",
        "analyzer": "lowercasespaceanalyzer"
      },
      "partNumber": {
        "type": "text",
        "analyzer": "lowercasespaceanalyzer"
      },
      "createdAt": {
        "type": "date",
        "format": "strict_date_optional_time"
      }
    }
  }
}
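To show what I expect this analyzer to produce, here is a rough plain-Python approximation (not Elasticsearch itself): split on whitespace, then lowercase each token. The function name and the second sample value are made up for illustration, and Python's `str.split()` is only assumed to be close enough to the `whitespace` tokenizer for these inputs.

```python
def lowercase_space_analyze(text: str) -> list[str]:
    """Approximate the custom analyzer: whitespace tokenizer + lowercase filter."""
    # str.split() with no arguments splits on runs of whitespace,
    # so hyphens and digits stay inside the token.
    return [token.lower() for token in text.split()]


# The hyphenated identifier is kept as a single lowercased token.
print(lowercase_space_analyze("HEAVENLUXE-SG23FEB2022"))
# -> ['heavenluxe-sg23feb2022']

# Hypothetical multi-word value: one token per whitespace-separated word.
print(lowercase_space_analyze("PO 1234 Heavenluxe-SG23FEB2022"))
# -> ['po', '1234', 'heavenluxe-sg23feb2022']
```

So a value like HEAVENLUXE-SG23FEB2022 should be indexed as one lowercased term, which is why a partial search term only matches it when wildcards are used.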
And the query used is:
{
  "query": {
    "query_string": {
      "query": "*HEAVENLUXE-SG23FEB2022*",
      "fields": [
        "modelName^2",
        "shortName^4",
        "channelSkuId^3",
        "unaSkuId^1",
        "optionalSellerOrderId^3",
        "internalOmsOrderId^1",
        "sellerOrderId^4",
        "partNumber^4",
        "purchaseOrderNumber^3"
      ]
    }
  }
}
This mostly worked, except that because of the wildcards in the query_string, substring matches were appearing among the top results.
I need exact matches to rank higher in the result set. Substring matches should be included only if there is no exact match.
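To make the requirement concrete, here is a minimal sketch in plain Python (not Elasticsearch) of the behaviour I am after. The function name `search` and the sample values are hypothetical, and it compares whole field values rather than analyzed tokens, so it only illustrates the intended result set, not how to build the query.

```python
def search(term: str, values: list[str]) -> list[str]:
    """Return case-insensitive exact matches; fall back to substring
    matches only when there is no exact match at all."""
    needle = term.lower()

    # Exact matches: the whole value equals the search term, ignoring case.
    exact = [v for v in values if v.lower() == needle]
    if exact:
        return exact

    # No exact match: fall back to case-insensitive substring matches.
    return [v for v in values if needle in v.lower()]


# Hypothetical indexed values, for illustration only.
values = [
    "HEAVENLUXE-SG23FEB2022",
    "HEAVENLUXE-SG23FEB2022-A",
    "heavenluxe-sg24",
]

# Full identifier: only the exact match is returned.
print(search("heavenluxe-sg23feb2022", values))
# -> ['HEAVENLUXE-SG23FEB2022']

# Partial term: no exact match exists, so substring matches are returned.
print(search("sg23", values))
# -> ['HEAVENLUXE-SG23FEB2022', 'HEAVENLUXE-SG23FEB2022-A']
```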