Help: Flattened aggregations (with limiting and sorting)

Here's a bit of background info:

I'm interested in using aggregations to produce the distinct key
combinations for multiple "term" fields and then getting a "measure" value
for each combination. This can be accomplished by nesting terms aggregations
into a tree and applying whatever "measure" aggregations are needed at the
lowest sub-aggregation.

For instance:
"aggs": {
  "field1Agg": {
    "terms": {
      "field": "field1"
    },
    "aggs": {
      "field2Agg": {
        "terms": {
          "field": "field2"
        },
        "aggs": {
          "measure1Agg": { "sum": { "field": "measure1" } }
        }
      }
    }
  }
}

Now, when I get this data back, I just recursively flatten the results into
a single List. I then apply whatever sorting and limiting after the fact.
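
To make the flattening step concrete, here is a minimal Python sketch of what I mean. The function name `flatten` and the sample response values are made up for illustration; the response shape (a "buckets" list per terms aggregation, a "value" per sum aggregation) is what I assume Elasticsearch returns for the query above.

```python
def flatten(aggs, agg_names, measure_name, prefix=()):
    """Recursively flatten nested terms-aggregation results into rows.

    aggs         -- dict holding the aggregation named agg_names[0]
    agg_names    -- terms aggregation names, outermost first
    measure_name -- name of the metric aggregation at the lowest level
    """
    name, rest = agg_names[0], agg_names[1:]
    rows = []
    for bucket in aggs[name]["buckets"]:
        keys = prefix + (bucket["key"],)
        if rest:
            # Descend into the next terms aggregation inside this bucket.
            rows.extend(flatten(bucket, rest, measure_name, keys))
        else:
            # Lowest level: emit one row of keys plus the measure value.
            rows.append(keys + (bucket[measure_name]["value"],))
    return rows


# Hypothetical "aggregations" section of a response (values invented).
response_aggs = {
    "field1Agg": {"buckets": [
        {"key": "a", "doc_count": 3, "field2Agg": {"buckets": [
            {"key": "x", "doc_count": 2, "measure1Agg": {"value": 5.0}},
            {"key": "y", "doc_count": 1, "measure1Agg": {"value": 1.0}},
        ]}},
        {"key": "b", "doc_count": 1, "field2Agg": {"buckets": [
            {"key": "x", "doc_count": 1, "measure1Agg": {"value": 2.0}},
        ]}},
    ]}
}

rows = flatten(response_aggs, ["field1Agg", "field2Agg"], "measure1Agg")
```

Each row is a tuple of the term keys followed by the measure, so sorting and limiting afterwards is a plain list operation.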

The bad:
This requires me to request every record from Elasticsearch, which is not
ideal.

So is there a way to do the sorting and limiting in Elasticsearch itself
rather than after I flatten the data? I saw the "top_hits" aggregation, but
I'm not sure how it applies...

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e8eafbce-3636-4532-847b-409730241a8f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Thoughts, anybody? I saw that you can somewhat do this with "scripts" and
letting the top aggregation encompass all term fields, but is that any more
performant?
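
For reference, the script approach I'm describing would look roughly like the sketch below, written as a Python helper that builds the request body. The helper name `combined_terms_agg`, the aggregation name "combinedAgg", and the "|" separator are my own placeholders. The inline string form of "script" and the "_term" order key are assumptions based on the 1.x-era query DSL; other versions spell these differently.

```python
import json


def combined_terms_agg(fields, measure_field, size, separator="|"):
    """Build a request body with one terms aggregation over several fields.

    The bucket key is a script that concatenates the field values, so the
    "size" and "order" settings apply to the combined key (assumed syntax).
    """
    joiner = " + '%s' + " % separator
    script = joiner.join("doc['%s'].value" % f for f in fields)
    return {
        "aggs": {
            "combinedAgg": {
                "terms": {
                    "script": script,
                    "size": size,
                    "order": {"_term": "asc"},
                },
                "aggs": {
                    "measure1Agg": {"sum": {"field": measure_field}}
                },
            }
        }
    }


body = combined_terms_agg(["field1", "field2"], "measure1", size=10)
print(json.dumps(body, indent=2))
```

This only shows the shape of the request; it says nothing about whether the script is faster or slower than the nested version.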


Hi Matt,

I don't understand what the problem is, can you maybe try to elaborate a
bit?

Thanks.

On Fri, Oct 24, 2014 at 4:00 PM, Matt Traynham skitch920@gmail.com wrote:

Thoughts, anybody? I saw that you can somewhat do this with "scripts" and
letting the top aggregation encompass all term fields, but is that any more
performant?


--
Adrien Grand


Hey Adrien,

Say I have two fields in my index with values:

genre = {Action, Adventure}
actor = {Tom Cruise, Jason Statham}

I'm looking for a way to get the distinct combinations of values with doc
counts, so I use a sub-aggregation:
"aggs": {
  "genreAgg": {
    "terms": {
      "field": "genre"
    },
    "aggs": {
      "actorAgg": {
        "terms": {
          "field": "actor"
        },
        "aggs": {
          "measureAgg": { "sum": { "field": "docCount" } }
        }
      }
    }
  }
}

When I get the data back, I flatten it into a CSV format:
Action, Tom Cruise, 50
Adventure, Tom Cruise, 40
Action, Jason Statham, 20
Adventure, Jason Statham, 40

My question is: is there a better way to do this? I'm not especially worried
about recursively flattening the data. My points of interest are:

  1. Performance - My top aggregation may not be the one with the lowest
    cardinality. Can ES handle that for me?
  2. Sorting & Limiting - I have to fetch all the data for these fields. Say
    I want to "sort by actor, limit 1". Where do I apply the sort? It can't
    be on the genre field. The actor field seems logical, but then I still
    can't limit the genre field at all. Fetching all the data and then
    flattening works because I can sort correctly and then limit.
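
To show what "sort by actor, limit 1" means on the flattened data, here is a small Python sketch using the rows from the CSV above. The helper name `sort_and_limit` is just for illustration, and this is the client-side step I'd like to push down into ES.

```python
from operator import itemgetter

# Flattened rows: (genre, actor, measure), copied from the CSV above.
rows = [
    ("Action", "Tom Cruise", 50),
    ("Adventure", "Tom Cruise", 40),
    ("Action", "Jason Statham", 20),
    ("Adventure", "Jason Statham", 40),
]


def sort_and_limit(rows, column, limit, descending=False):
    """Sort flattened rows by one column index, then keep the first `limit`."""
    ordered = sorted(rows, key=itemgetter(column), reverse=descending)
    return ordered[:limit]


# "sort by actor, limit 1": column 1 is the actor.
top = sort_and_limit(rows, column=1, limit=1)
```

The sort has to see every (genre, actor) row before the limit is applied, which is why I currently fetch everything.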

I have seen that you can use script fields to return single rows. But can
you sort and limit by a script field?
