Elasticsearch (7.8) NEST (C#) aggregations

Document count: ~10 billion
Elasticsearch disk usage: ~2 TB
Elasticsearch 7.8 via the NEST (C#) API.

My model is a transaction receipt. A simplified schema:

[Keyword]
public string customer_id { get; set; } // cardinality 100M
[Keyword]
public string employee_id { get; set; } // cardinality ~5M
[Keyword]
public string store_zip { get; set; } // cardinality 100k
[Keyword]
public List<string> codes { get; set; } // up to 25 entries, cardinality ~300k

For this phase I want counts from terms aggregations per customer or per employee. I believe a composite aggregation can get me this 'pivot' from transactions to customers or employees.

I have one use case that I cannot figure out: filter to customers that have made 2 or more purchases containing any of a particular set of codes.

As an example:
Filter to customers that have made >= 2 purchases with codes { "A1", "A2", "A3" }
then
Aggregate per customer by the top 4 zip codes

Any guidance would be helpful.

Thank you in advance for your time.

Mark Harwood has a great presentation about modeling event data as entity data to handle queries that are too expensive to run at search time.

https://www.elastic.co/videos/entity-centric-indexing-mark-harwood
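If the search-time route is still worth trying on a subset of the data, the use case from the question could be sketched roughly as below. This is a minimal, untested sketch rather than a definitive answer. It makes several assumptions: the index name "receipts" and the class and method names are hypothetical; "any of the codes" is read as a terms query; a bucket_selector under a composite aggregation is accepted by your version (worth verifying on 7.8); and the top 4 zip codes are counted only over the receipts that matched the code filter. With ~100M customers this will need many pages and may well be too expensive, which is the point of the entity-centric approach above.

```csharp
using System.Collections.Generic;
using Nest;

// Mirrors the simplified schema from the question (only the fields used here).
public class Receipt
{
    [Keyword] public string customer_id { get; set; }
    [Keyword] public string store_zip { get; set; }
    [Keyword] public List<string> codes { get; set; }
}

public static class CustomerPivot
{
    // Requests one page of customers. "after" is null for the first page;
    // for later pages pass the after key returned by the previous page.
    public static ISearchResponse<Receipt> Page(IElasticClient client, CompositeKey after = null)
    {
        return client.Search<Receipt>(s => s
            .Index("receipts")   // hypothetical index name
            .Size(0)             // only aggregations are needed, no hits
            // Keep receipts that contain at least one of the codes.
            .Query(q => q.Terms(t => t.Field(f => f.codes).Terms("A1", "A2", "A3")))
            .Aggregations(a => a
                // Pivot from receipts to customers, 1000 customers per page.
                .Composite("by_customer", c => c
                    .Size(1000)
                    .After(after)
                    .Sources(src => src.Terms("customer", t => t.Field(f => f.customer_id)))
                    .Aggregations(sub => sub
                        // Top 4 zip codes among this customer's matching receipts.
                        .Terms("top_zips", t => t.Field(f => f.store_zip).Size(4))
                        // Drop customers with fewer than 2 matching receipts.
                        .BucketSelector("at_least_two", bs => bs
                            .BucketsPath(bp => bp.Add("count", "_count"))
                            .Script("params.count >= 2"))))));
    }
}
```

Because the selector runs after each page of buckets is built, a page can come back with fewer customers than the requested size, so paging should be driven by the returned after key and not by the number of buckets.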
