I'm performing some statistical analysis on keywords and have been stumped trying to query for a count of occurrences.
Here's a simplified example of what I wish to achieve:
Example Documents
{ "Colour": "Red" }
{ "Colour": "Green" }
{ "Colour": "Blue" }
{ "Colour": "Red" }
{ "Colour": "Red" }
Output Description
Count of Colour values that occur in exactly 1 document
Count of Colour values that occur in exactly 2 documents
...
Count of Colour values that occur in 50+ documents
Output Example
[
  {
    "Key": "1",
    "Count": 2 // Green and Blue only occur in 1 document
  },
  {
    "Key": "2",
    "Count": 0
  },
  {
    "Key": "3",
    "Count": 1 // Red occurs in 3 documents
  }
]
I would like the result to be constrained by a query. I've considered using a terms aggregation with partitions and counting the results in code, but this seems sub-optimal.
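For reference, here is a minimal sketch of the client-side counting step, assuming the terms aggregation buckets (each with `key` and `doc_count`) have already been fetched and concatenated across all partitions. The function name `count_of_counts` and the `cap` parameter are hypothetical names chosen for illustration, and the Elasticsearch request itself is deliberately left out.

```python
from collections import Counter


def count_of_counts(buckets, cap=50):
    """Turn terms-aggregation buckets into a 'count of counts' list.

    buckets: iterable of dicts shaped like {"key": ..., "doc_count": ...}
             (assumed to be the merged buckets from every partition).
    cap:     doc counts at or above this value share one "<cap>+" entry.
    """
    # How many distinct terms occur in exactly n documents (capped at `cap`).
    freq = Counter(min(b["doc_count"], cap) for b in buckets)
    if not freq:
        return []

    result = []
    # Emit every key from 1 up to the largest observed count,
    # including keys with a count of 0.
    for n in range(1, max(freq) + 1):
        key = f"{cap}+" if n == cap else str(n)
        result.append({"Key": key, "Count": freq.get(n, 0)})
    return result


# Buckets corresponding to the example documents above.
example_buckets = [
    {"key": "Red", "doc_count": 3},
    {"key": "Green", "doc_count": 1},
    {"key": "Blue", "doc_count": 1},
]
example_result = count_of_counts(example_buckets)
```

With the example documents this yields the three entries shown in the output example. The drawback is that every distinct Colour value still has to be pulled back to the client, which is why I'd prefer to do this inside the query.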