Elastic Search > @SearchUI > group with sort

I've been trying for days to get group + sort working with @search-ui, using either the app-search connector or the Elasticsearch connector, with limited success in both cases.

Originally I used the app-search connector and set the group on the query:

group: {
      collapse: true,
      field: 'card_code_rarity',
},

This almost gives me the result I want, with each search result containing a _group property with the documents in the group. The only additional thing I need is to sort by the count of these documents. If the following worked it would be fantastic:

group: {
      collapse: true,
      field: 'card_code_rarity',
},
sort: [
    { '_group.length': 'desc' },
],

But it doesn't. After hitting non-stop dead ends, I decided to try the Elasticsearch connector with search-ui.

I can get some form of a usable end result with the following in Postman:

aggs: {
            card_code_rarity: {
               terms: {
                  field: "card_code_rarity",
                  order: {
                     "_count": "desc"
                  }
               },
               aggs: {
                  my_docs: {
                     top_hits: {
                        sort: [
                           {
                              "selling_price": { "order": "desc" }
                           }
                        ]
                     }
                  }
               }
            }
         },

However, this part of the query gets completely omitted by the Elasticsearch connector. It doesn't seem to support this, and it won't even forward the params as-is.

This connector also doesn't seem to have a beforeSearch hook like the app-search one, and it's not easy to extend since it doesn't export handleSearchRequest(). So what to do?

Ideally, I can go back to using AppSearch connector and get the sorting to work somehow.

Otherwise how can I get the aggregate query to work with the Elasticsearch connector?

Actually, would be nice to know how to do it with either connector in case I need to switch again for some other limitation.

Looks like I can achieve the query I need by passing a handler as the second parameter to the connector, like so:

const connector = new ElasticsearchAPIConnector(
  { ... }, 
  (requestBody, requestState, queryConfig) => {   
    requestBody.aggs = queryConfig.aggs;

    return requestBody;
  }
);

This probably shouldn't be necessary, but I do get the expected results in the Chrome network tab.
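For anyone reusing that handler, here is a minimal sketch of a variant that merges instead of overwriting. It assumes only the three-argument shape shown in the snippet above; the type aliases and the name forwardAggs are hypothetical stand-ins, not the connector's own types.

```typescript
// Hypothetical stand-in types; the real connector types are not imported here.
type RequestBody = Record<string, unknown>;
type QueryConfig = { aggs?: Record<string, unknown> } & Record<string, unknown>;

// Same (requestBody, requestState, queryConfig) shape as the handler above.
function forwardAggs(
  requestBody: RequestBody,
  _requestState: unknown,
  queryConfig: QueryConfig
): RequestBody {
  // Nothing to forward: return the body untouched.
  if (!queryConfig.aggs) return requestBody;
  // Keep any aggregations already present on the body.
  const existing = (requestBody.aggs ?? {}) as Record<string, unknown>;
  // Return a copy rather than mutating the body that was passed in.
  return { ...requestBody, aggs: { ...existing, ...queryConfig.aggs } };
}
```

It could then be passed as the second argument in place of the inline arrow function. Merging rather than assigning means aggregations that are already on the request body are not dropped.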

However, yet another blocker here is I'm unable to extract the full Elasticsearch response from the connector, and there is nothing I can use in the SearchDriver's state object.

hey @parliament718

You should be able to get the raw elasticsearch response for each result via the _meta.rawHit

Joe

@joemcelroy
Thanks for the reply.

Unfortunately that is not enough. I need the aggregations value at the root of the response, which is a sibling of hits.

I've been digging into the lib source like crazy and it's just so hard to extend because of all the default exports, none of the useful classes are exported.

As such, I can't extend SearchkitResponseTransformer, which seems to be what I need, and I see no way to provide my own.

Even with some hacks to the compiled code to get that property from the response, my implementation is broken with respect to pagination, as it will just return the same buckets each time.

Is there a way to go back to AppSearchConnector, where it was returning the "_group" on each hit and the pagination was working? That was sooooo close to what I need. All I need is to sort the hits by the count of docs in the _group, like:

group: {
      collapse: true,
      field: 'card_code_rarity',
},
sort: [
    { '_group.length': 'desc' }, // is there something I'm missing to make this work?
],

I can't believe this is so hard; it seems like a common and basic use case.

Yeah, elasticsearch-connector wasn't designed to be extensible like that.

It's an API limitation of App Search that it doesn't allow sorting by the group.

One option could be to fork the elasticsearch-connector and customise.

An alternative is using Searchkit directly (www.searchkit.co). That allows customising the full Elasticsearch query and response with request hooks (see "Search Relevance – Searchkit").

At this point it's not a matter of which connector but whether ES can even do this. After testing all kinds of aggregates my results are as follows:

Terms Aggregate:

  • Allows sort by _count :white_check_mark:
  • Gets actual documents :white_check_mark:
  • No pagination :x:

Composite Aggregate:

  • Allows pagination via "after" :white_check_mark:
  • Doesn't allow sort over _count :x:
  • Doesn't return actual documents :x:
  • Doesn't work with top_hits :x:

Bucket Sort Aggregate:

  • Allows to sort by _count :white_check_mark:
  • Allows pagination via "from" :white_check_mark:
  • Doesn't return actual documents :warning:
  • Works with top_hits to return actual documents :white_check_mark:

The Bucket Sort Aggregate appears to be giving me the desired results but is there some other limitation I should be aware of? Are these results accurate? Frankly it's shocking the state of capabilities after a decade of development.

Here's what appears to be working for me, a mix of bucket_sort and top_hits:

aggs: {
            my_bucket_sort: {
               terms: {
                  field: 'card_code_rarity',
                  // Sorting could also be done here via "order"
                  size: 1000000  // Keep this large so every bucket reaches bucket_sort
               },
               aggs: {
                  my_bucket_sort: {
                     bucket_sort: {
                        sort: [
                           {
                              _count: {
                                 order: 'desc'
                              }
                           }
                        ],
                        from: 0, // use this to paginate
                        size: 12  // page size
                     },
                  },
                  //need this to get the actual documents
                  listings: {
                     top_hits: { }
                  }
               }
            }
         },
         size: 0,


Is combining bucket_sort with top_hits like this a supported query by design, or are my results coincidental/unreliable?
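To make the pagination part concrete, here is a minimal sketch that builds the same request for a given page. The function name and the aggregation names by_card and page_of_buckets are hypothetical (the snippet above uses my_bucket_sort for both levels); the only logic added is computing "from" out of a 1-based page number.

```typescript
// Builds the terms + bucket_sort + top_hits request for one page of buckets.
// `page` is 1-based; the field default is the one used in this thread.
function buildGroupedAggs(page: number, pageSize: number, field = "card_code_rarity") {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return {
    size: 0, // no top-level hits, only aggregation buckets
    aggs: {
      by_card: {
        // Large size so every bucket is available to bucket_sort.
        terms: { field, size: 1000000 },
        aggs: {
          page_of_buckets: {
            bucket_sort: {
              sort: [{ _count: { order: "desc" } }],
              from: (page - 1) * pageSize, // offset into the sorted buckets
              size: pageSize,
            },
          },
          // Needed to get the actual documents in each bucket.
          listings: { top_hits: {} },
        },
      },
    },
  };
}
```

For example, buildGroupedAggs(3, 12) sets from to 24 and size to 12, so the third page of 12 cards is returned.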

Could I ask what you are trying to do, and what experience you hope to achieve? I made the assumption that you're grouping on a field and want to show inner results. The collapse functionality could work for you: see "Collapse search results" in the Elasticsearch Guide [8.6].

@joemcelroy
Collapse sounded promising but I'm not getting expected results. If I add:

"collapse": {
    "field": "card_code_rarity"        
  },

All that gives me is the following extra data for each hit in the response:

"fields": {
      "card_code_rarity": [
         "TDGS-EN038-(UtR)"
       ]
  }

I was expecting some sort of bucket with the documents for that rarity. OK, let me try to explain what I'm trying to do:

It's a marketplace for cards. There are Card objects and CardListing objects.

When a Card is "listed" for sale, I take the Card object, clone it, add some properties (selling_price, etc), and store it in the same index.

card_code_rarity is unique for a Card but not for a CardListing (since for 3 listings I'm cloning the card with the same card_code_rarity).

I want the user to search all cards and for each result to see how many listings there are, with the cards having the most listings at the top.

Example Results Display:

  • Card A: 4 listings from $20 - $50
  • Card B: 3 listings from $5 - $15
  • Card C: 0 listings

To do this I want to

  • group by card_code_rarity to get the listings per card
  • sort by the _count to get most listed cards at the top
  • all this should paginate, facet, filter, sort, etc normally at the top level

Thanks again for the help

Although I haven't fully vetted the results yet, combining the bucket_sort and top_hits aggregations seems to give the expected results. I already see one limitation, though: top_hits.size can be at most 100 (the default of the index.max_inner_result_window setting), meaning only 100 listings per card will be retrieved, and the displayed info will be inaccurate if there are more. This is fine for now, but it isn't scalable.

I only display a handful of cards at a time due to pagination, so it would be better to retrieve all the listings for them.
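As a rough sketch of how the returned buckets could be turned into the display lines described earlier: the interfaces below model only the parts of a terms bucket with a top_hits sub-aggregation named listings, and the function names are hypothetical. It assumes that only listing documents carry selling_price, so the Card document itself is skipped, and it can only count the listings that top_hits actually returned.

```typescript
// Only the fields this sketch reads from each hit and bucket.
interface ListingSource { selling_price?: number }
interface CardBucket {
  key: string;
  doc_count: number; // counts every document in the bucket, including the Card itself
  listings: { hits: { hits: Array<{ _source: ListingSource }> } };
}
interface CardRow {
  card: string;
  listingCount: number;
  minPrice: number | undefined;
  maxPrice: number | undefined;
}

// Converts buckets into rows; a hit is treated as a listing only if it has a price.
function toCardRows(buckets: CardBucket[]): CardRow[] {
  return buckets.map((bucket) => {
    const prices = bucket.listings.hits.hits
      .map((hit) => hit._source.selling_price)
      .filter((p): p is number => typeof p === "number");
    return {
      card: bucket.key,
      listingCount: prices.length,
      minPrice: prices.length ? Math.min(...prices) : undefined,
      maxPrice: prices.length ? Math.max(...prices) : undefined,
    };
  });
}

// Formats one row in the style of the example results display.
function formatRow(row: CardRow): string {
  if (row.minPrice === undefined || row.maxPrice === undefined) {
    return `${row.card}: 0 listings`;
  }
  return `${row.card}: ${row.listingCount} listings from $${row.minPrice} - $${row.maxPrice}`;
}
```

Because the count comes from the returned hits, it is subject to the same top_hits size cap mentioned above.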

Have you tried using the inner_hits option with collapse as well? You can specify the sorting there. Sorry if you have considered this already.

You might be able to do this with the nested fields approach, by indexing each card with its card listings as nested children that carry the price.

@joemcelroy Thank you.

The inner_hits approach seems to give me expected results as far as hierarchy, but I'm still unable to sort by the count of inner_hits documents.

 "collapse": {
            "field": "card_code_rarity",
            "inner_hits": {
               "name": "most_recent",
               "size": 5,
               "sort": [{ "selling_price": "desc" }]
            },
            "max_concurrent_group_searches": 4
         },
         sort: [
              // still need a way to sort at this level based on the count of inner_hits
         ],

If that is supported I would be golden. If not, nesting the documents would represent a significant refactoring, although I have an idea of how that might work.

In this article, a constant weight: 1 is added to each child document and then summed for the count using "mode": "sum". Is that the best approach if nesting? Could I do something similar with inner_hits (adding a weight property to each listing) and not need nesting?
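A minimal sketch of what that sort clause might look like with nesting, under the assumption that listings are stored under a nested field named listings and each carries weight: 1. Both names and the function name are hypothetical; the clause uses the nested sort syntax with "mode": "sum", so the sort value is the number of nested listings.

```typescript
// Builds a sort clause that orders parent documents by the sum of a constant
// per-child weight, i.e. by the number of nested children.
// `path` and `weightField` are hypothetical mapping names.
function buildListingCountSort(path = "listings", weightField = "weight") {
  return [
    {
      [`${path}.${weightField}`]: {
        mode: "sum",      // add up the weight of every nested child
        order: "desc",    // most-listed cards first
        nested: { path }, // tells Elasticsearch which nested object to sort within
      },
    },
  ];
}
```

The result would go in the top-level sort of the search request. It has not been tested against the data model in this thread.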

Thank you

Here's a long-running thread about collapse and being able to sort by the collapsed groups: "Add support for collapse sort to specify how to sort groups relative to each other" (Issue #45646, elastic/elasticsearch on GitHub). It hasn't made it to a release yet :frowning_face:

I think you will run into pagination problems, and problems integrating this into search-ui, as well!

I suggest looking at nested. Searchkit supports nested fields (which search-ui doesn't).

I don't know the best strategy to sort for your example, but you can sort on nested documents in this example.
