Hello,
I'm having the exact same problem.
Have you managed to find a solution?
My thread is here: LINK
https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/Oum03VSBzHQ
Thanks
On Thursday, October 16, 2014 1:57:35 PM UTC+1, Alastair James wrote:
Hi there.
I am trying to create an aggregation that mimics the following SQL query:
SELECT col1, col2, COUNT(*), SUM(metric) FROM table GROUP BY col1, col2
ORDER BY SUM(metric) DESCOn the face of it, I could create an terms aggregation for col1, add a
terms aggregation for col2 inside it, and the metric aggregations inside
that. I could then dynamically build the SQL result like grid and sort it
myself. However this breaks down for large results set, or a paginated
result set of a larger result.The problem is that the ES aggregation system always returns the top N
results for each parent and child bucket. Thus for each value of col1 I
have N values of col2.What I really want is to consider all possible combinations of col1 and
col2 in the same way as SQL does it and return the top N based on some
other metric. E.g. in ES speak, a single aggregation where the keys are
tuples of (col1, col2).I suppose one way would be to use a script terms aggregation to
concatenate each value of col1 and col2, however thats going to be slow.Does anyone else have any ideas?
Ideally there would be a tuple aggregation built in, e.g.:
"my_agg":{
"tuple":{
"fields":["col1","col2"]
}
}Would product keys that are objects like:
{
"col1":"value1",
"col2":"value2"
}Does anyone know if this would be possible to write as a plugin?
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/713127be-b89e-42ee-8811-18dd0e31d16a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.