We have an index with over 100 million documents. Each document has a structure like this:
{
// ...
"first_name": "John",
"last_name": "Doe",
"company": "X Company",
// ...
}
We want to return at most N items per company (say 3), even though a company may have many more than 3 records.
Here are the records:
first_name, last_name, company
John, Doe, X Company
Jane, Doe, X Company
George, Doe, X Company
William, Doe, X Company
Jack, Doe, X Company
Ellen, Doe, Y Company
Harper, Doe, Z Company
Mason, Doe, Z Company
Ella, Doe, Z Company
Scarlett, Doe, Z Company
Here is the expected query result:
first_name, last_name, company
John, Doe, X Company
Jane, Doe, X Company
George, Doe, X Company
Ellen, Doe, Y Company
Harper, Doe, Z Company
Mason, Doe, Z Company
Ella, Doe, Z Company
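To make the requirement concrete, here is a minimal Python sketch of the semantics we are after. It is not an Elasticsearch query, and the helper name `limit_per_group` is made up for illustration. It works in memory on the sample records above, which is exactly what we cannot do with 100 million documents:

```python
from collections import defaultdict

# Sample records from the question: (first_name, last_name, company)
ROWS = [
    ("John", "Doe", "X Company"), ("Jane", "Doe", "X Company"),
    ("George", "Doe", "X Company"), ("William", "Doe", "X Company"),
    ("Jack", "Doe", "X Company"), ("Ellen", "Doe", "Y Company"),
    ("Harper", "Doe", "Z Company"), ("Mason", "Doe", "Z Company"),
    ("Ella", "Doe", "Z Company"), ("Scarlett", "Doe", "Z Company"),
]
records = [
    {"first_name": f, "last_name": l, "company": c} for f, l, c in ROWS
]


def limit_per_group(records, key, limit):
    """Keep at most `limit` records per distinct value of `key`, in input order."""
    counts = defaultdict(int)
    kept = []
    for record in records:
        group = record[key]
        if counts[group] < limit:
            counts[group] += 1
            kept.append(record)
    return kept


# Hypothetical usage: at most 3 records per company.
result = limit_per_group(records, "company", 3)
for r in result:
    print(f'{r["first_name"]}, {r["last_name"]}, {r["company"]}')
```

The open question is how to get the same per-company cap from the index itself without pulling every document to the client.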
How can we write this kind of query? We don't want to use a terms aggregation with a top hits sub-aggregation, since we have 100 million records. Or is there an aggregation that performs well on this much data?