There is an index with fields:
Each record may or may not have an account_id, and the same goes for fingerprint. I need to group the records by account_id and, along the way, add to each account's group the records that have no account_id but whose fingerprint appears in at least one record of that account.
The goal is to display a paginated list of unique statistics records, split into known accounts (records that have an account_id, plus records that lack one but share a fingerprint with a record in that account's group) and unknown ones (records that only have a fingerprint).
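To make the intended grouping concrete, here is a minimal sketch of the same logic on a plain in-memory PHP array. It is not an Elasticsearch query, only an illustration of the result I am after. The function name groupRecords, the 'id' field and the sample values are hypothetical, and it assumes that when a fingerprint shows up under several accounts, the first account seen wins.

```php
<?php
declare(strict_types=1);

/**
 * Group records into "known" (keyed by account_id) and "unknown"
 * (keyed by fingerprint). Hypothetical helper, for illustration only.
 */
function groupRecords(array $records): array
{
    // Pass 1: remember which account each fingerprint was seen with.
    $accountByFingerprint = [];
    foreach ($records as $record) {
        $accountId = $record['account_id'] ?? null;
        $fingerprint = $record['fingerprint'] ?? null;
        if ($accountId !== null && $fingerprint !== null) {
            // Assumption: the first account seen for a fingerprint wins.
            $accountByFingerprint[$fingerprint] ??= $accountId;
        }
    }

    // Pass 2: assign every record to a known account or an unknown fingerprint.
    $groups = ['known' => [], 'unknown' => []];
    foreach ($records as $record) {
        $accountId = $record['account_id'] ?? null;
        $fingerprint = $record['fingerprint'] ?? null;
        if ($accountId === null && $fingerprint !== null) {
            // No account_id: try to resolve the account through the fingerprint.
            $accountId = $accountByFingerprint[$fingerprint] ?? null;
        }
        if ($accountId !== null) {
            $groups['known'][$accountId][] = $record;
        } elseif ($fingerprint !== null) {
            $groups['unknown'][$fingerprint][] = $record;
        }
        // Records with neither field are skipped.
    }

    return $groups;
}

// Sample data (hypothetical values).
$records = [
    ['id' => 1, 'account_id' => 'a1', 'fingerprint' => 'f1'],
    ['id' => 2, 'fingerprint' => 'f1'],
    ['id' => 3, 'fingerprint' => 'f2'],
    ['id' => 4, 'account_id' => 'a1'],
];
$groups = groupRecords($records);
```

With this sample data, records 1, 2 and 4 land in the known group for account a1, and record 3 lands in the unknown group for fingerprint f2. What I am looking for is a way to get the equivalent result, with pagination, from Elasticsearch itself.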
I tried this:
    'aggs' => [
        "items" => [
            "composite" => [
                "sources" => [
                    ['account_id' => ['terms' => ['field' => 'account_id']]],
                    ['fingerprint' => ['terms' => ['field' => 'fingerprint']]],
                ],
            ],
            "aggs" => [
                "hits" => [
                    "top_hits" => ["size" => 100],
                ],
            ],
        ],
    ]
I also tried something like this (I know this code is not entirely correct; I just recovered it from my history):
    'collapse' => [
        'field' => 'account_id',
        'inner_hits' => [
            [
                'name' => 'accounts',
                'size' => 1,
                'sort' => [['timestamp' => 'desc']],
            ],
            [
                'name' => 'fingerprints',
                'size' => 1,
                'sort' => [['timestamp' => 'desc']],
            ],
        ],
        "max_concurrent_group_searches" => 4,
    ],