Nested multi_match

How does a multi_match query work with the nested type?
It works with type object, but the document-to-material relation is 1:n, so I cannot use type object. The relevant part of the mapping:
"MATERIAL": {
  "type": "nested",
  "properties": {
    "NAME": {
      "type": "string",
      "include_in_all": true,
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed" },
        "filter": { "type": "string", "index": "not_analyzed" }
      }
    },
    "COLOR": {
      "type": "string",
      "include_in_all": true,
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed" },
        "filter": { "type": "string", "index": "not_analyzed" }
      }
    },
    "PROBE": {
      "type": "string",
      "include_in_all": true,
      "fields": {
        "raw": { "type": "string", "index": "not_analyzed" },
        "filter": { "type": "string", "index": "not_analyzed" }
      }
    },
    "WEIGHT": { "type": "double", "include_in_all": true }
  }
},

With this query:
'query' => [
    'multi_match' => [
        "type" => "most_fields",
        "query" => $searchString,
        "fields" => [ "MATERIAL.NAME^2", "MATERIAL.COLOR" ]
    ]
],

I think I see what you are asking, but I would recommend that you try to specify what problem you are experiencing more clearly in the future, so it is easier to provide an answer. An example would be: I have this mapping, I index this document, I send this query and I get some unexpected result.

If MATERIAL is a nested object, each material is indexed as a separate Lucene document. To query MATERIAL.NAME and MATERIAL.COLOR you therefore need to wrap the multi_match query in a nested query. That way the nested documents are queried rather than the top-level one.
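
As a rough sketch of that wrapping, the request body could be built like this. The field names and the most_fields type come from the question above; the function name build_nested_multi_match is only an illustrative helper, and the dict is meant to mirror the JSON body one-to-one rather than any particular client library.

```python
import json


def build_nested_multi_match(search_string, path="MATERIAL"):
    """Wrap a multi_match query in a nested query.

    The nested query tells Elasticsearch to run the inner query against
    the nested documents stored under `path`, not the top-level document.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "multi_match": {
                        "type": "most_fields",
                        "query": search_string,
                        # Fields must be addressed by their full path.
                        "fields": [f"{path}.NAME^2", f"{path}.COLOR"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the body as JSON so it can be compared with the PHP array above.
    print(json.dumps(build_nested_multi_match("red zirconia"), indent=2))
```

The same structure carries over to the PHP array in the question: the existing 'multi_match' array becomes the value of 'query' inside a 'nested' array that also has 'path' => 'MATERIAL'.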

Thanks. Next time I will provide the full structure and query.

OK. The search results are still incorrect.
Mapping https://gist.github.com/kostromich/8e4140d3057cb18c2e200078259adcfc
Search query https://gist.github.com/kostromich/b4dabe9b9aba6ee325e52035d76f3dab
Result https://gist.github.com/kostromich/4931a496183de5ea345bf18520d0d49d

There are 17,000+ documents in ES. I have 300+ rings with red round zirconia.
Why do I see earrings in the first 10 results? What's wrong with the query?

Giving good results to bad queries is a tricky business.

I don't mean to say your Query DSL is bad, but what the original user gives you is plain text rather than a structured set of choices (color:red, department:rings, etc.).

For unstructured queries, a default set of ranking heuristics applies:

  1. The more words that match the better
  2. Rare words are better than common words
  3. Short documents are better than long ones
  4. Docs that repeat words are better than those that have them only once

Your "explain" output does not detail the matches on the nested documents, but I imagine rule #2 is the one that applies most here: the terms name:Zirconia and type:Zirconia are seen as very rare and therefore interesting, while color:red and category:rings are quite common and therefore seen as boring (almost as boring as text:with). Another issue with the rareness heuristic is that multi-field searches can favour precisely the wrong field: red is seen as more interesting when found in any field other than color, where it is commonplace. The "cross_fields" mode of multi_match attempts to overcome this wrong-field bias, but it relies on subtle per-word, per-field scoring tweaks that are undone if you apply global boosts at the field level (such as the ^2 on MATERIAL.NAME).
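
To make the rareness heuristic concrete, here is a small illustration modelled on the inverse document frequency term of Lucene's classic TF/IDF similarity, 1 + ln(N / (df + 1)). The 17,000 total comes from the thread; the per-term document frequencies are made-up numbers purely for illustration, and the similarity your Elasticsearch version actually uses may differ.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF in the style of Lucene's classic TF/IDF: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


NUM_DOCS = 17000  # index size mentioned in the thread

# Hypothetical document frequencies, NOT taken from the real index.
doc_freqs = {"red": 4000, "rings": 5000, "zirconia": 350}

# The rarer the term, the larger its weight.
weights = {term: classic_idf(df, NUM_DOCS) for term, df in doc_freqs.items()}

if __name__ == "__main__":
    for term, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{term:10s} df={doc_freqs[term]:5d} idf={weight:.2f}")
```

Under these assumed frequencies the rare term ends up with a much larger weight than the two common ones, so a document matching only "zirconia" (an earring, say) can outrank one matching only "red" and "rings".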

Ultimately you know more about user intent than a default set of ranking heuristics ever will. You could consider pre-processing the query to identify structure where it is missing, turning a bad query that is just a bag of words into a more structured one, e.g. by spotting the colors or departments in the text and pulling those out into more structured clauses or "did you mean?" suggestions. The Percolate API may be a good way of implementing this pre-processor with "things to look for in user queries".
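
A minimal sketch of such a pre-processor follows. It uses a plain in-memory vocabulary lookup rather than the Percolate API, and everything outside the MATERIAL.* fields is an assumption: the vocabularies, the CATEGORY field name, and the structure_query helper are hypothetical. It also uses the bool query's filter clause; older Elasticsearch versions expressed the same idea with a filtered query.

```python
# Hypothetical vocabularies; in practice these would come from your own data.
KNOWN_COLORS = {"red", "blue", "green"}
KNOWN_CATEGORIES = {"rings", "earrings"}


def nested_match(field, text, path="MATERIAL"):
    """Match `text` on a field of the nested MATERIAL documents."""
    return {"nested": {"path": path, "query": {"match": {field: text}}}}


def structure_query(text):
    """Pull recognised colors and categories out of free text into filters."""
    colors, categories, rest = [], [], []
    for word in text.lower().split():
        if word in KNOWN_COLORS:
            colors.append(word)
        elif word in KNOWN_CATEGORIES:
            categories.append(word)
        else:
            rest.append(word)

    # Recognised words become non-scoring filters (all of them must match).
    filters = [nested_match("MATERIAL.COLOR", color) for color in colors]
    # CATEGORY is an assumed top-level field name, not from the posted mapping.
    filters += [{"match": {"CATEGORY": category}} for category in categories]

    # Whatever is left is still scored as free text.
    if rest:
        must = [nested_match("MATERIAL.NAME", " ".join(rest))]
    else:
        must = [{"match_all": {}}]
    return {"query": {"bool": {"must": must, "filter": filters}}}
```

For the search in this thread, structure_query("red round zirconia rings") would filter on the color and the category and leave only "round zirconia" to be ranked, so common but essential words like "rings" can no longer be outweighed by rare ones.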
