Hi,
I have a query that works as intended. However, when I try to replicate it in DSL, I cannot get the generated query to match, so the results are incorrect.
What I need to achieve is:
- Weights are applied to the relevant fields, as shown in the code below
- A single search box in the UI allows searching against any field (i.e. a global search across 14 fields)
- smit dan -> Should match 'Dan Smith, Danny Smithe, Daniel Smithson'
- 536 (as a string) -> should match '536, 5367, 53678'
- Exact matches are scored highest followed by closest match
- The 'Must: Match' sub queries MUST match exactly
- The 'Must: Should' queries SHOULD match, but BOTH of them should match, not just one
I tried:
- Phrase_Prefix: Did not work well for searches such as 536. The exact match 536 was scored low and ANY partial match was scored much higher.
- Fuzzy: Was not accurate enough for my use case. For example, 'Smit' would return 'Smil, smile, smol' before 'Smith', even with 'transpositions' off and a low 'fuzziness'.
I build the query up dynamically (additional 'filters' can be added or removed in the UI, e.g. specifying some or all categories to search in) into a variable like this:
var mustMatchFilters = new List<Func<QueryContainerDescriptor<Doc>, QueryContainer>>();
The query is then executed like this:
var searchResponse = await elasticClient.SearchAsync<PatientDoc>(s => s
    .Query(q => q
        .Bool(b => b
            .Must(mustMatchFilters)
        )
    )
    .Size(request.ResultsLimit)
);
I'm trying to create a query using DSL that matches this query:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "active": {
              "query": "true"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "query_string": {
                  "analyze_wildcard": true,
                  "default_operator": "and",
                  "fields": [
                    "idText^10",
                    "lastName^9",
                    "firstName^8",
                    "email",
                    "title",
                    "address"
                  ],
                  "query": "smit*"
                }
              },
              {
                "multi_match": {
                  "fields": [
                    "idText^10",
                    "lastName^9",
                    "firstName^8",
                    "email",
                    "title",
                    "address"
                  ],
                  "operator": "and",
                  "query": "smit*"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "size": 30
}
I use 'multi_match' and 'query_string' together only because I need the wildcard. Without both queries, every result gets exactly the same score, so the additional sub-query was added to counteract that.
I have the following in DSL:
shouldMatchFilters.Add(f => f
    .Bool(b => b
        .Should(s => s
            .QueryString(q => q
                .Query($"{queryPart}*")
                .Fields(fs => fs
                    .Field(f => f.IdText, 10)
                    .Field(f => f.LastName, 9)
                    .Field(f => f.FirstName, 8)
                    .Field(f => f.Email)
                    .Field(f => f.Title)
                    .Field(f => f.Address)
                )
                .DefaultOperator(Operator.And)
                .AnalyzeWildcard(true)
            )
        )
    )
);
shouldMatchFilters.Add(f => f
    .Bool(b => b
        .Should(s => s
            .MultiMatch(m => m
                .Operator(Operator.And)
                .Query($"{queryPart}*")
                .Fields(fs => fs
                    .Field(f => f.IdText, 10)
                    .Field(f => f.LastName, 9)
                    .Field(f => f.FirstName, 8)
                    .Field(f => f.Email)
                    .Field(f => f.Title)
                    .Field(f => f.Address)
                )
            )
        )
    )
);
This generates the query below:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "active": {
              "query": "true"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "query_string": {
                  "analyze_wildcard": true,
                  "default_operator": "and",
                  "fields": [
                    "idText^10",
                    "lastName^9",
                    "firstName^8",
                    "email",
                    "title",
                    "address"
                  ],
                  "query": "smit*"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "multi_match": {
                  "fields": [
                    "idText^10",
                    "lastName^9",
                    "firstName^8",
                    "email",
                    "title",
                    "address"
                  ],
                  "operator": "and",
                  "query": "smit*"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "size": 30
}
The 'Should' queries are now separate, i.e. Q1: 'bool => should' and Q2: 'bool => should', instead of Q1: 'bool => [should && should]'.
I tried correcting this by adding them together in DSL as below:
shouldMatchFilters.Add(f => f
    .Bool(b => b
        .Should(s => s
            .QueryString(q => q
                .Query($"{queryPart}*")
                .Fields(fs => fs
                    .Field(f => f.IdText, 10)
                    .Field(f => f.LastName, 9)
                    .Field(f => f.FirstName, 8)
                    .Field(f => f.Email)
                    .Field(f => f.Title)
                    .Field(f => f.Address)
                )
                .DefaultOperator(Operator.And)
                .AnalyzeWildcard(true)
            )
        )
        .Should(s => s
            .MultiMatch(m => m
                .Operator(Operator.And)
                .Query($"{queryPart}*")
                .Fields(fs => fs
                    .Field(f => f.IdText, 10)
                    .Field(f => f.LastName, 9)
                    .Field(f => f.FirstName, 8)
                    .Field(f => f.Email)
                    .Field(f => f.Title)
                    .Field(f => f.Address)
                )
            )
        )
    )
);
This generates the following query:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "active": {
              "query": "true"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "multi_match": {
                  "fields": [
                    "idText^10",
                    "lastName^9",
                    "firstName^8",
                    "email",
                    "title",
                    "address"
                  ],
                  "operator": "and",
                  "query": "smit*"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "size": 30
}
As you can see, the result contains only the 'multi_match' sub-query; the 'query_string' sub-query from the other 'Should' call is missing entirely.
Can anyone advise what I have done wrong, or whether what I need to do is even possible?
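One possible explanation, offered as an untested sketch and not as a confirmed fix: as far as I can tell, NEST's `Should` accepts several query lambdas in a single call, and calling `Should` a second time on the same bool descriptor replaces the earlier clauses instead of appending to them. That would be consistent with only 'multi_match' surviving above. The sketch below passes both sub-queries to one `Should` call. `Doc` is a hypothetical stand-in for the document class, `QueryBuilder` and `BuildTextQuery` are made-up names, and the field list is shortened.

```csharp
using System;
using Nest;

// Hypothetical stand-in for the indexed document type (only three fields shown).
public class Doc
{
    public string IdText { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

public static class QueryBuilder
{
    // Hypothetical helper: builds ONE bool query whose "should" array
    // contains both the query_string and the multi_match sub-queries.
    public static Func<QueryContainerDescriptor<Doc>, QueryContainer> BuildTextQuery(string queryPart)
    {
        return f => f
            .Bool(b => b
                // Both lambdas are passed to a single Should call,
                // so neither one overwrites the other.
                .Should(
                    s => s.QueryString(q => q
                        .Query($"{queryPart}*")
                        .Fields(fs => fs
                            .Field(d => d.IdText, 10)
                            .Field(d => d.LastName, 9)
                            .Field(d => d.FirstName, 8))
                        .DefaultOperator(Operator.And)
                        .AnalyzeWildcard(true)),
                    s => s.MultiMatch(m => m
                        .Operator(Operator.And)
                        .Query($"{queryPart}*")
                        .Fields(fs => fs
                            .Field(d => d.IdText, 10)
                            .Field(d => d.LastName, 9)
                            .Field(d => d.FirstName, 8)))
                )
            );
    }
}
```

The result could then be added to the list with `mustMatchFilters.Add(QueryBuilder.BuildTextQuery(queryPart));`. If both sub-queries really have to match, adding `.MinimumShouldMatch(2)` to the bool descriptor may be worth testing, since a bool query with only 'should' clauses requires just one of them to match by default.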
Thanks,
Danny.