Multiple nested 'Should' sub-queries using DSL

daniel.butler · April 11, 2022, 11:08am

Hi,

I have a query that works as intended. However, when I try to replicate it in DSL, I cannot get the generated query to match, so the results are incorrect.

What I need to achieve is:

As seen in the code below, weights are added to the relevant fields
A single search box in the UI allows searching against any field (i.e. a global search across 14 fields)
smit dan -> should match 'Dan Smith, Danny Smithe, Daniel Smithson'
536 (as a string) -> should match '536, 5367, 53678'
Exact matches are scored highest, followed by the closest matches
The 'Must: Match' sub-queries MUST match exactly
The 'Must: Should' queries SHOULD match, but BOTH should match, not just one

I tried:

Phrase_Prefix: did not work well for searches such as 536; an exact match on 536 was scored low and ANY partial match was scored much higher.
Fuzzy: was not accurate enough for my use case; 'Smit' would return 'Smil, smile, smol', for example, before 'Smith', even with 'transpositions' off and a low 'fuzziness'.

I build the query up dynamically (additional 'filters' can be added/removed in the UI, i.e. specify some or all categories to search in) into a variable like this:

var mustMatchFilters = new List<Func<QueryContainerDescriptor<PatientDoc>, QueryContainer>>();

The query is then executed like this:

	var searchResponse = await elasticClient.SearchAsync<PatientDoc>(s => s
		.Query(q => q
			.Bool(b => b
				.Must(mustMatchFilters)
			)
		)
		.Size(request.ResultsLimit)
	);
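
For context, here is a minimal sketch of how a clause such as the 'active' match in the JSON further down might be added to that list. The `PatientDoc` class shown here, its boolean `Active` property, and the `FilterBuilder` helper are assumptions made for illustration; only the list type and the `match` clause come from the post.

```csharp
using System;
using System.Collections.Generic;
using Nest;

// Hypothetical document type; only the property used below is shown.
public class PatientDoc
{
    public bool Active { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class FilterBuilder
{
    public static List<Func<QueryContainerDescriptor<PatientDoc>, QueryContainer>> BuildMustFilters()
    {
        var mustMatchFilters = new List<Func<QueryContainerDescriptor<PatientDoc>, QueryContainer>>();

        // Intended to serialise to: { "match": { "active": { "query": "true" } } }
        // (NEST camel-cases property names by default, so Active becomes "active").
        mustMatchFilters.Add(f => f
            .Match(m => m
                .Field(p => p.Active)
                .Query("true")
            )
        );

        return mustMatchFilters;
    }
}
```

Each entry in the list becomes one element of the outer `must` array when the list is passed to `.Must(...)`.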

I'm trying to create a query using DSL that matches this query:

{
	"query": {
		"bool": {
			"must": [
				{
					"match": {
						"active": {
							"query": "true"
						}
					}
				},
				{
					"bool": {
						"should": [
							{
								"query_string": {
									"analyze_wildcard": true,
									"default_operator": "and",
									"fields": [
										"idText^10",
										"lastName^9",
										"firstName^8",
										"email",
										"title",
										"address"
									],
									"query": "smit*"
								}
							},
							{
								"multi_match": {
									"fields": [
										"idText^10",
										"lastName^9",
										"firstName^8",
										"email",
										"title",
										"address"
									],
									"operator": "and",
									"query": "smit*"
								}
							}
						]
					}
				}
			]
		}
	},
	"size": 30
}

I use 'multi_match' and 'query_string' together only because I need the wildcard, and without both queries every result is scored exactly the same, so the additional sub-query was added to counteract that.

I have the following in DSL:

	shouldMatchFilters.Add(f => f
		.Bool(b => b
			.Should(s => s
				.QueryString(q => q
					.Query($"{queryPart}*")
					.Fields(fs => fs
						.Field(f => f.IdText,10)
						.Field(f => f.LastName,9)
						.Field(f => f.FirstName,8)
						.Field(f => f.Email)
						.Field(f => f.Title)
						.Field(f => f.Address)
					)
					.DefaultOperator(Operator.And)
					.AnalyzeWildcard(true)
				)
			)
		)
	);

	shouldMatchFilters.Add(f => f
		.Bool(b => b
			.Should(s => s
				.MultiMatch(m => m
					.Operator(Operator.And)
					.Query($"{queryPart}*")
					.Fields(fs => fs
						.Field(f => f.IdText,10)
						.Field(f => f.LastName,9)
						.Field(f => f.FirstName,8)
						.Field(f => f.Email)
						.Field(f => f.Title)
						.Field(f => f.Address)
					)
				)
			)
		)
	);

This generates the query as below:

{
	"query": {
		"bool": {
			"must": [
				{
					"match": {
						"active": {
							"query": "true"
						}
					}
				},
				{
					"bool": {
						"should": [
							{
								"query_string": {
									"analyze_wildcard": true,
									"default_operator": "and",
									"fields": [
										"idText^10",
										"lastName^9",
										"firstName^8",
										"email",
										"title",
										"address"
									],
									"query": "smit*"
								}
							}
						]
					}
				},
				{
					"bool": {
						"should": [
							{
								"multi_match": {
									"fields": [
										"idText^10",
										"lastName^9",
										"firstName^8",
										"email",
										"title",
										"address"
									],
									"operator": "and",
									"query": "smit*"
								}
							}
						]
					}
				}
			]
		}
	},
	"size": 30
}

The 'Should' queries are now separate, i.e. Q1: 'bool => should' and Q2: 'bool => should', instead of Q1: 'bool => [should && should]'.

I tried correcting this by adding them together in DSL as below:

			.Bool(b => b
				.Should(s => s
					.QueryString(q => q
						.Query($"{queryPart}*")
						.Fields(fs => fs
							.Field(f => f.IdText,10)
							.Field(f => f.LastName,9)
							.Field(f => f.FirstName,8)
							.Field(f => f.Email)
							.Field(f => f.Title)
							.Field(f => f.Address)
						)
						.DefaultOperator(Operator.And)
						.AnalyzeWildcard(true)
					)
				)
				.Should(s => s
					.MultiMatch(m => m
						.Operator(Operator.And)
						.Query($"{queryPart}*")
						.Fields(fs => fs
							.Field(f => f.IdText,10)
							.Field(f => f.LastName,9)
							.Field(f => f.FirstName,8)
							.Field(f => f.Email)
							.Field(f => f.Title)
							.Field(f => f.Address)
						)
					)
				)
			)
			);

which generates the following query:

{
	"query": {
		"bool": {
			"must": [
				{
					"match": {
						"active": {
							"query": "true"
						}
					}
				},
				{
					"bool": {
						"should": [
							{
								"multi_match": {
									"fields": [
										"idText^10",
										"lastName^9",
										"firstName^8",
										"email",
										"title",
										"address"
									],
									"operator": "and",
									"query": "smit*"
								}
							}
						]
					}
				}
			]
		}
	},
	"size": 30
}

which, as you can see, contains only one of the two 'Should' sub-queries: the 'multi_match' is there, but the 'query_string' is missing entirely.

Can anyone advise what I have done wrong, or whether what I need to do is even possible?

Thanks,
Danny.
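
A possible explanation, offered as an assumption about NEST's fluent descriptors rather than anything confirmed in this thread: each `.Should(...)` call sets the clause list instead of appending to it, so a second `.Should(...)` on the same `bool` replaces the first. `.Should(...)` accepts several lambdas in one call, and the minimal sketch below passes both sub-queries that way. The `PatientDoc` class and the `GlobalSearchFilter.Build` helper are hypothetical; the property names are inferred from the field names in the JSON above.

```csharp
using System;
using Nest;

// Hypothetical document type; property names are inferred from the JSON field names.
public class PatientDoc
{
    public string IdText { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public string Email { get; set; }
    public string Title { get; set; }
    public string Address { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class GlobalSearchFilter
{
    // Returns one bool query whose "should" array holds both sub-queries.
    public static Func<QueryContainerDescriptor<PatientDoc>, QueryContainer> Build(string queryPart)
    {
        return f => f
            .Bool(b => b
                // A single Should call with two lambdas, so neither clause overwrites the other.
                .Should(
                    s => s.QueryString(q => q
                        .Query($"{queryPart}*")
                        .Fields(fs => fs
                            .Field(p => p.IdText, 10)
                            .Field(p => p.LastName, 9)
                            .Field(p => p.FirstName, 8)
                            .Field(p => p.Email)
                            .Field(p => p.Title)
                            .Field(p => p.Address)
                        )
                        .DefaultOperator(Operator.And)
                        .AnalyzeWildcard(true)
                    ),
                    s => s.MultiMatch(m => m
                        .Operator(Operator.And)
                        .Query($"{queryPart}*")
                        .Fields(fs => fs
                            .Field(p => p.IdText, 10)
                            .Field(p => p.LastName, 9)
                            .Field(p => p.FirstName, 8)
                            .Field(p => p.Email)
                            .Field(p => p.Title)
                            .Field(p => p.Address)
                        )
                    )
                )
            );
    }
}
```

The result would be added once, e.g. `mustMatchFilters.Add(GlobalSearchFilter.Build(queryPart));`, which should yield a single inner `bool` inside `must`. If both clauses genuinely have to match, Elasticsearch's `minimum_should_match` parameter on the inner `bool` (or moving both clauses to `must`) is the usual way to express that; the target JSON above does not set it.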

system · May 9, 2022, 11:08am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
