Are buckets persistent?

ES 6.3

Related to this thread: Are contents of a bucket held in the same shard?

Discussing Bucket Aggregations: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html

I thought buckets were persistent: that we store documents in specific buckets that we define ourselves.

My colleague believes buckets are dynamic, built at query time.

Who is right?

What do you mean by "buckets"?

I have updated the original post above to make that clear.

Thanks!

Buckets are computed live. They are not persisted on disk, if that is what you meant.

Your colleague is right.

However, the data structures (doc values) that make this computation fast are written to disk at index time.

David,

Can you help me reconcile this thread with what we discussed here:

On that thread, it sounded like I could define a bucket up front (for a given account), and then documents added to that bucket would be held together on a single shard.

Please help me understand all the way.

Thanks!

If you use a routing key, all documents indexed with the same key go to the same shard.
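To illustrate why documents sharing a routing key end up together, here is a minimal sketch of the documented idea that the shard is chosen as hash(routing) modulo the number of primary shards. It is only an illustration: Elasticsearch uses a Murmur3 hash internally, while this sketch swaps in CRC32 from the Python standard library, and the names `shard_for`, `docs`, and the shard count of 5 are hypothetical.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Pick a shard number from a routing key.

    Stand-in hash: Elasticsearch itself uses Murmur3, not CRC32,
    so the actual shard numbers will differ from these.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Hypothetical documents, routed by their userId.
docs = [
    {"id": 1, "userId": "7"},
    {"id": 2, "userId": "42"},
    {"id": 3, "userId": "7"},
    {"id": 4, "userId": "7"},
]

# Group document ids by the shard their routing key maps to.
placement = {}
for doc in docs:
    shard = shard_for(doc["userId"], num_primary_shards=5)
    placement.setdefault(shard, []).append(doc["id"])

for shard, ids in sorted(placement.items()):
    print(f"shard {shard}: docs {ids}")
```

Because the function depends only on the routing key and the shard count, every document for user 7 lands on one shard, whichever shard that happens to be.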

Buckets are computed live, each time you run a search request that contains an aggregation.
If you pass a routing key with that request, only the shard that the key maps to is used to run the aggregation.

So if your documents have a userId field and you use userId as the routing key, then to compute an aggregation for user 7 only, you can pass routing=7, add a bool query with a filter on userId=7, and run a terms aggregation on whichever field you want. The filter is still needed because other users may be routed to the same shard.
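As a rough sketch of the request described above, the following builds the search path and JSON body for a routed terms aggregation. It only constructs the request and does not send it. The index name `events`, the aggregated field `category`, the aggregation name `per_value`, and the helper `build_routed_terms_agg` are all hypothetical; only `routing`, the `bool` filter, and the `terms` aggregation come from the discussion.

```python
import json
from urllib.parse import urlencode


def build_routed_terms_agg(index, user_id, agg_field, size=10):
    """Return (path, body) for a search limited to one routing key."""
    # routing=<key> restricts the search to the shard that key maps to.
    path = f"/{index}/_search?" + urlencode({"routing": user_id})
    body = {
        # size 0: return only aggregation results, no hits.
        "size": 0,
        # Other users may share the shard, so filter on userId as well.
        "query": {"bool": {"filter": [{"term": {"userId": user_id}}]}},
        # Buckets are built from the matching documents at query time.
        "aggs": {"per_value": {"terms": {"field": agg_field, "size": size}}},
    }
    return path, body


path, body = build_routed_terms_agg("events", 7, "category")
print("GET", path)
print(json.dumps(body, indent=2))
```

The body could then be sent with any HTTP client or Elasticsearch client library; the same routing value has to be used at index time for the documents to be found on that shard.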

Now, what problem are you actually trying to solve?

David,

This is a continuation of the thought started here.

In our earlier conversations here, I had the impression that buckets had a persistence aspect. From reviewing the docs and your comments on this thread, it appears I was mistaken. It appears that routing plus shards is the only path for me.

I suppose the next question becomes: when I run bucket aggregations, how do I make sure ES limits itself to a single shard? (Does routing apply to bucket aggregations?)

Thank you!

That does not answer my question. Why do you want to know that? Is there any reason you don't want to use the default behavior?
