Retrieving a specific number of nested documents


#1

Say I have an index with this mapping:
PUT /item
{
    "mappings": {
        "_doc" : {
            "properties" : {
                "name": { "type" : "keyword" },
                "supplierName": { "type" : "keyword" },
                "comments" : {
                    "type" : "nested",
                    "properties" : {
                        "username" : { "type" : "keyword" },
                        "comment" : { "type" : "text" }
                    }
                }
            }
        }
    }
}

and I want to retrieve a specific number of comments on items from a specific supplier, where each comment was made by a specific user.
A user can comment on an item as many times as they want, and a supplier can have many items.
Example:
PUT /item/_doc/1?refresh
{
     "name":"ItemOne",
     "supplierName":"CoolSupplier",
     "comments": [
         {"username": "mark", "comment": "Cool item1"},
         {"username": "mark", "comment": "Cool item2"},
         {"username": "mark", "comment": "Cool item3"},
         {"username": "mark", "comment": "Cool item4"},
         {"username": "mark", "comment": "Cool item5"},
         {"username": "mark", "comment": "Cool item6"},
         {"username": "jake", "comment": "Bad item"},
         {"username": "paul", "comment": "Great item"}
     ]
}

So say I want to retrieve a fixed number of comments, each with the name of its item, for a specific supplier and user, regardless of whether the comments are all on a single item or spread across several.
If I use nested inner_hits like this:
GET item/_search
{
  "size": 4,
  "_source": "name",
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "supplierName": "CoolSupplier"
          }
        },
        {
          "nested": {
            "path": "comments",
            "query": {
              "match": {
                "comments.username": "mark"
              }
            },
            "inner_hits": {
              "size": 4
            }
          }
        }
      ]
    }
  }
}

With this query, up to four parent documents are returned, and up to four comments can be returned for each of them.
The thing is, I only want four nested documents in total. Those four could all come from the first matching parent document, or one could come from each of four different parent documents.
Is there a way to specify a total (maximum) number of inner_hits to return, regardless of the parent document?
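The only workaround I can think of with inner_hits is trimming on the client. Here is a minimal Python sketch of that idea, not a server-side solution. It assumes the search response has already been parsed into a dict, that the inner hits appear under the default name "comments" (the nested path), and that "_source" contains "name" as requested in the query. The function name collect_nested_hits is made up for illustration.

```python
def collect_nested_hits(response, limit, inner_name="comments"):
    """Flatten inner hits across parent documents, keeping at most `limit` in total.

    `response` is the parsed JSON body of the inner_hits search.
    `inner_name` is assumed to be the default inner_hits name (the nested path).
    """
    collected = []
    for parent in response["hits"]["hits"]:
        inner = parent.get("inner_hits", {}).get(inner_name, {})
        for child in inner.get("hits", {}).get("hits", []):
            if len(collected) >= limit:
                # Total cap reached: stop, whichever parent we are in.
                return collected
            collected.append({
                "item_id": parent["_id"],
                "name": parent["_source"]["name"],
                "comment": child["_source"],
            })
    return collected
```

Calling collect_nested_hits(response, 4) would give me four comments at most, each paired with its item name, but Elasticsearch still does the work of fetching up to 4 x 4 inner hits, which is what I would like to avoid.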

An alternative I found is the top_hits metric aggregation:
GET item/_search
{
  "size": 0,
  "aggs": {
    "outerFilter": {
      "filter": {
        "match": {
          "supplierName": "CoolSupplier"
        }
      },
      "aggs": {
        "commentAggs": {
          "nested": {
            "path": "comments"
          },
          "aggs": {
            "commentsFilter": {
              "filter": {
                "match": {
                  "comments.username": "mark"
                }
              },
              "aggs": {
                "foundComments": {
                  "top_hits": {
                    "size": 4
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This correctly returns a maximum of four comments regardless of how many parent documents they come from. The only problem is that I also want to retrieve information from the parent document. Is there a way to do this?
If I wanted to retrieve, say, the name of the item each comment was made on, how would I do that? Would I need to run the aggs query and then run a second query with all of the item ids in order to retrieve the names?
Or, if the parent document had a createDate timestamp and I wanted to order the comments by it, is that possible with the top_hits aggregation? I haven't been able to figure it out.
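To make the two-step idea concrete, here is a rough Python sketch of the client side. It assumes that each hit returned by top_hits under the nested aggregation carries the parent document's "_id" (that is how I read the response, but I have not checked it on every version), and that the aggregation names match the query above. The function names are hypothetical, and the body built at the end is only what I would expect to send to a multi-get request to look up the item names.

```python
def parent_ids_from_top_hits(response):
    """Return the distinct parent ids of the top_hits results, in hit order.

    Assumes each nested hit exposes the parent document id as "_id".
    """
    hits = (response["aggregations"]["outerFilter"]["commentAggs"]
            ["commentsFilter"]["foundComments"]["hits"]["hits"])
    ids = []
    for hit in hits:
        if hit["_id"] not in ids:  # several comments may share one parent
            ids.append(hit["_id"])
    return ids


def build_mget_body(response):
    """Build a multi-get request body for the parents of the returned comments."""
    return {"ids": parent_ids_from_top_hits(response)}
```

This works on paper, but it costs a second round trip and does not help with ordering by a parent field, so I would prefer a single query if one exists.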

Should the mapping maybe use the join datatype instead of nested?


#2

No one?