Grouped search results are inconsistent when page.size changes

I want to remove fuzzy matching at low precision, so I plan to take the query from search_explain and send it to elasticsearch/_search myself to perform the grouping.

While doing this, I ran into a problem.

I set these params:

query: 'tshirt',
group: { 'field': 'companyId', 'size': 5 },
filter: {'all': [{'companyId': ['68036', '67900', '67758', '67735', '68639', '68530', '40834', '67665', '68501', '68479', '68910', '68333', '67757', '67840', '68920', '68460', '68832', '68534', '68615', '68361', '67868', '68387', '67819', '68351', '68538', '68256', '67745', '68317', '68293', '68292', '68343', '68518', '67718', '68694', '68162', '68816']}]},
page: {'size': 5}

got results:

Then I changed only the page size:

.....
page: {'size': 36}

got results:

When only the page size differs, the two sets of results are inconsistent. Why is this?

What were the results of using the search_explain API for both queries?

@Sean_Story

{
  "query": {
    "bool": {
      "must": {
        "function_score": {
          "boost_mode": "sum",
          "score_mode": "sum",
          "query": {
            "bool": {
              "must": [
                {
                  "bool": {
                    "should": [
                      {
                        "multi_match": {
                          "query": "tshirt",
                          "minimum_should_match": "1<-1 3<49%",
                          "type": "cross_fields",
                          "fields": [
                            ......
                          ]
                        }
                      }
                    ]
                  }
                }
              ]
            }
          },
          "functions": [
            {
              "script_score": {
                "script": {
                  "source": "Math.max(_score * ((0.2 * Math.max(0.0001, Math.log(Math.max(0.0001, (doc.containsKey(\"rank.float\") && !doc[\"rank.float\"].empty ? doc[\"rank.float\"].value + 1 : 1)))))) - _score, 0)"
                }
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "company_id.enum": [
                  "68036",
                  "67900",
                  "67758",
                  "67735",
                  "68639",
                  "68530",
                  "40834",
                  "67665",
                  "68501",
                  "68479",
                  "68910",
                  "68333",
                  "67757",
                  "67840",
                  "68920",
                  "68460",
                  "68832",
                  "68534",
                  "68615",
                  "68361",
                  "67868",
                  "68387",
                  "67819",
                  "68351",
                  "68538",
                  "68256",
                  "67745",
                  "68317",
                  "68293",
                  "68292",
                  "68343",
                  "68518",
                  "67718",
                  "68694",
                  "68162",
                  "68816"
                ]
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "_score": "desc"
    },
    {
      "_doc": "desc"
    }
  ],
  "highlight": {
    "fragment_size": 300,
    "type": "plain",
    "number_of_fragments": 1,
    "order": "score",
    "encoder": "html",
    "require_field_match": false,
    "fields": {}
  },
  "size": 0,
  "from": 0,
  "timeout": "30000ms",
  "_source": [
    "id"
  ],
  "aggs": {
    "group": {
      "terms": {
        "field": "company_id.enum",
        "size": 5,
        "order": {
          "top_hit": "desc"
        }
      },
      "aggs": {
        "group_hits": {
          "top_hits": {
            "size": 5,
            "sort": [
              {
                "_score": "desc"
              }
            ],
            "highlight": {
              "fragment_size": 300,
              "type": "plain",
              "number_of_fragments": 1,
              "order": "score",
              "encoder": "html",
              "require_field_match": false,
              "fields": {}
            }
          }
        },
        "top_hit": {
          "max": {
            "script": {
              "source": "_score"
            }
          }
        }
      }
    },
    "estimated_total_groups": {
      "cardinality": {
        "field": "company_id.enum"
      }
    }
  }
}

and, with page.size set to 36 (only the size of the terms aggregation changes, from 5 to 36):

{
  "query": {
    "bool": {
      "must": {
        "function_score": {
          "boost_mode": "sum",
          "score_mode": "sum",
          "query": {
            "bool": {
              "must": [
                {
                  "bool": {
                    "should": [
                      {
                        "multi_match": {
                          "query": "tshirt",
                          "minimum_should_match": "1<-1 3<49%",
                          "type": "cross_fields",
                          "fields": [
                            ......
                          ]
                        }
                      }
                    ]
                  }
                }
              ]
            }
          },
          "functions": [
            {
              "script_score": {
                "script": {
                  "source": "Math.max(_score * ((0.2 * Math.max(0.0001, Math.log(Math.max(0.0001, (doc.containsKey(\"rank.float\") && !doc[\"rank.float\"].empty ? doc[\"rank.float\"].value + 1 : 1)))))) - _score, 0)"
                }
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "company_id.enum": [
                  "68036",
                  "67900",
                  "67758",
                  "67735",
                  "68639",
                  "68530",
                  "40834",
                  "67665",
                  "68501",
                  "68479",
                  "68910",
                  "68333",
                  "67757",
                  "67840",
                  "68920",
                  "68460",
                  "68832",
                  "68534",
                  "68615",
                  "68361",
                  "67868",
                  "68387",
                  "67819",
                  "68351",
                  "68538",
                  "68256",
                  "67745",
                  "68317",
                  "68293",
                  "68292",
                  "68343",
                  "68518",
                  "67718",
                  "68694",
                  "68162",
                  "68816"
                ]
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "_score": "desc"
    },
    {
      "_doc": "desc"
    }
  ],
  "highlight": {
    "fragment_size": 300,
    "type": "plain",
    "number_of_fragments": 1,
    "order": "score",
    "encoder": "html",
    "require_field_match": false,
    "fields": {}
  },
  "size": 0,
  "from": 0,
  "timeout": "30000ms",
  "_source": [
    "id"
  ],
  "aggs": {
    "group": {
      "terms": {
        "field": "company_id.enum",
        "size": 36,
        "order": {
          "top_hit": "desc"
        }
      },
      "aggs": {
        "group_hits": {
          "top_hits": {
            "size": 5,
            "sort": [
              {
                "_score": "desc"
              }
            ],
            "highlight": {
              "fragment_size": 300,
              "type": "plain",
              "number_of_fragments": 1,
              "order": "score",
              "encoder": "html",
              "require_field_match": false,
              "fields": {}
            }
          }
        },
        "top_hit": {
          "max": {
            "script": {
              "source": "_score"
            }
          }
        }
      }
    },
    "estimated_total_groups": {
      "cardinality": {
        "field": "company_id.enum"
      }
    }
  }
}

Hey there @ZE_Share, thanks for reaching out!

I think that this is working correctly based on the grouping configuration. As you can see in the output of the search_explain API that @Sean_Story suggested, grouping in App Search runs a terms aggregation (of size 36 in your second query) with a nested top_hits sub-aggregation of size 5. This means that each of the up to 36 terms can return up to 5 top hits.
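For reference, the part of the generated request that this describes can be condensed as follows. This is only an excerpt of the second search_explain output above with the highlighting options removed, not a complete query. In the first output, the only difference is a terms size of 5.

```json
{
  "aggs": {
    "group": {
      "terms": {
        "field": "company_id.enum",
        "size": 36,
        "order": { "top_hit": "desc" }
      },
      "aggs": {
        "group_hits": {
          "top_hits": {
            "size": 5,
            "sort": [{ "_score": "desc" }]
          }
        },
        "top_hit": {
          "max": { "script": { "source": "_score" } }
        }
      }
    }
  }
}
```

Here page.size maps to the terms size (how many groups come back), while the top_hits size of 5 is the per-group limit and is the same in both requests.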

Are you trying to paginate these grouped results? If that's the case, I think you want to add "collapse": true to your group. Note that this flag is experimental and will only work if the grouped field doesn't have multiple values, but that's the way to accomplish pagination using groups at this time with App Search.
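A minimal sketch of what such a request body might look like, reusing the query and group field from the original post. The page.current value is an assumption added for illustration, and the filter is omitted for brevity. Because collapse is experimental, its exact behavior should be checked against the App Search documentation for your version.

```json
{
  "query": "tshirt",
  "group": { "field": "companyId", "collapse": true },
  "page": { "size": 5, "current": 1 }
}
```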

Hope that helps!

@Kathleen_DeRusso Thank you for your reply. In fact, I am more curious about the different settings of page.size on the outer layer. Why does it affect nested top_hits?

It should affect the number of results that are returned, not the number of results returned in an individual top_hits result. I'm not really sure what you're seeing that's anomalous here. I think we'd need to see the actual queries and results to be more helpful.

Please look at my original question again. The items in the red box are the top_hits results (I converted the format).

The query conditions are the same and only page.size differs, yet the top_hits results are inconsistent. Taking 68036 as an example: with size=36, 4 top_hits are returned, but with size=5, only 2 are returned.

Right, I'm not sure why you're seeing that, so if you could trim your example down to something that we could reproduce, it would be easier to help answer you. Thank you!
