Are buckets persistent?

Jose_Ramierez · August 10, 2018, 12:13am

ES 6.3

Related to this thread: Are contents of a bucket held in the same shard?

Discussing Bucket Aggregations: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html

I thought buckets were persistent, that we store documents into specific buckets (of our definition).

My colleague believes buckets are dynamic, built at query time.

Who is right?

dadoonet · August 10, 2018, 1:09am

What do you mean by "buckets"?

Jose_Ramierez · August 10, 2018, 1:26am

I've edited the original post above to make that clear.

Thanks!

dadoonet · August 10, 2018, 1:38am

Buckets are computed live. They are not persistent on disk if this is what you meant.

Your colleague is right.

But the data structures (doc values) that make this computation fast are stored on disk at index time.
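To make "computed live" concrete, here is a rough conceptual sketch in plain Python. It is not how Elasticsearch is implemented; the documents and the `terms_buckets` helper are hypothetical and only illustrate that buckets are derived from the stored documents each time a query runs, rather than being containers that documents are written into.

```python
from collections import Counter

# Hypothetical documents; in Elasticsearch these would already be indexed.
docs = [
    {"userId": 7, "status": "open"},
    {"userId": 7, "status": "closed"},
    {"userId": 8, "status": "open"},
]


def terms_buckets(documents, field):
    """Group documents by the value of `field` at query time.

    Nothing is stored: the buckets exist only in the returned result.
    """
    return dict(Counter(doc[field] for doc in documents if field in doc))


# The same documents yield different buckets depending on the field asked for.
buckets_by_status = terms_buckets(docs, "status")
buckets_by_user = terms_buckets(docs, "userId")
```

Adding a document and calling the function again would change the buckets, which is the sense in which they are dynamic.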

Jose_Ramierez · August 12, 2018, 9:13pm

David,

Can you help me reconcile this thread, with what we discussed here:

On that thread, it sounded like I could define a bucket up front (for a given account), and then documents added to that bucket would be held together on a single shard.

Please help me understand all the way.

Thanks!

dadoonet · August 12, 2018, 11:48pm

If you use a routing key, all documents with the same key will go to the same single shard.

Buckets (that you compute when running an aggregation) are computed live when you run the search query.
If you pass a routing key to the request, then only one shard will be used to run the aggregation.

So if you have a userId in your documents and use userId as the routing key, then to compute an aggregation for user 7 (only), you can pass routing=7, run a bool query with a filter on userId=7, and add a terms aggregation on whichever field you want.
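As a minimal sketch of what such a request could look like, the Python below builds the search path and JSON body without sending anything. The index name `events`, the aggregated field `category`, the bucket name `by_field`, and the helper `build_routed_terms_agg` are all assumptions made for illustration; only `userId` and `routing=7` come from the description above.

```python
import json


def build_routed_terms_agg(index, user_id, agg_field, bucket_name="by_field"):
    """Build the path and body for a routed search with a terms aggregation.

    `index`, `agg_field` and `bucket_name` are illustrative placeholders.
    """
    # The routing value restricts the search to the shard that holds this key.
    path = f"/{index}/_search?routing={user_id}"
    body = {
        # size 0: return only the aggregation buckets, not the matching hits.
        "size": 0,
        # The filter is still needed, because other routing values
        # can land on the same shard.
        "query": {"bool": {"filter": [{"term": {"userId": user_id}}]}},
        # Buckets are computed from the filtered documents when the query runs.
        "aggs": {bucket_name: {"terms": {"field": agg_field}}},
    }
    return path, body


path, body = build_routed_terms_agg("events", 7, "category")
print("GET", path)
print(json.dumps(body, indent=2))
```

The printed path and body can be sent with any HTTP client or pasted into the Kibana console to try against your own index.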

Now, what problem are you trying to solve actually?

Jose_Ramierez · August 13, 2018, 12:26am

David,

This is a continuation of the thought started here.

In our earlier conversations here, I had the impression that buckets had a persistence aspect. From reviewing the docs and your comments on this thread, it appears I was mistaken. It appears that routing + shards is the only path for me.

I suppose the next question becomes: when I do bucket aggregations, how do I make sure ES limits itself to a single shard? (Does routing apply to bucket aggregations?)

Thank you!

dadoonet · August 13, 2018, 2:47am

That does not answer my question. Why do you want to know that? Is there any reason you don't want to use the default behavior?

system · September 10, 2018, 2:47am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
