Elasticsearch 5.4, delete_by_query, merge, TB scale


We are planning a xxTB (double digit) cluster of Elasticsearch 5.4.
The data model is ~10K indices, each index ~10GB of data accumulated over a period of one month per index (or ~330MB per day per index), with new data coming in every few minutes.

Each index has parent documents and child documents (parent/child relationship)
Therefore, the documents will be located in the same shard.

We would like to retain 1 month of data, and purge older data, while keeping documents which have children.

We plan to use delete_by_query to mark old documents with no children, then trigger a merge to free the storage.

We understand that the merge process will require as much free storage as the surviving documents which will be copied to new segments before the old ones are deleted.
so if we delete every half a month, we will need additional 50% of the storage for the merge process, if we delete every week, we will need additional ~75% of the storage and so on.

  • Has anyone faced a similar challenge?
  • What was your solution?
  • Are there pitfalls we are missing?
  • Is there a better way to do this?

Would appreciate some advice,

Why not store things in monthly indices?

What kind of data do you have? What is the driver behind using parent-child for time based data?

We need a sliding window of one month of data

The driver behind parent-child is to enable queries which are similar to "join" queries in relational databases.
i.e. there is a relationship between documents where one document represents a parent and his children are related to him.

queries like return all the parent documents which have child documents of "type" A AND "type" B AND "Type C" (not Elasticsearch type rather a logical type of document), and return the the child documents themselves, in a specified range of time.

If I am calculating this correctly, you are looking to index around 3TB of data per day, which means that you also need to delete the same amount of data through delete-by-query per day in order to manage your retention period. That is a lot of indexing and deleting. Given that your total data volume is around 100TB-200TB and that you need to leave headroom for querying in the cluster, you are likely to be looking at a sizeable cluster.

Before discussing potential options, I do have a few additional questions:

  1. How often are parent level documents updated?
  2. What is the expected ratio between parent and children?
  3. How many queries per second are you expecting to need to support? Will these always rages a single index?

Thank you for taking the time to understand our challenges :slight_smile:

  1. Parent level documents are not updated. In fact no document is updated after insertion.
  2. The ratio is roughly 1 to 1000. i.e. for every 1 parent there will be ~1000 children on average.
  3. We should be able to support ~100 queries per second, searches could be on specific indices or all indices depending on the query.

If parents are never updated, you may want to consider flattening and denormalising the model. As parent data would be duplicated across children, it would take up more space, although Elasticsearch should be able to compress it quite well. It would also change the way you query the data, but would make it possible for you to use more traditional time-based indices which would allow you to simply delete documents by deleting indices, which is much more efficient than deleting by query.

I would recommend benchmarking the two options to get a better feel for how they scale and perform.

On average each parent event is 339 bytes
Lets say we copy 200 unique bytes from the parent event to each child event, therefore each event will weigh 539 bytes.

We have ~3,100,000,000 events per day (without the parent events).
3,100,000,000*539 bytes * 30.4 days in a month = ~50 TB
instead of ~21TB of RAW data or x2.38 storage

so storage wise its a bit more than having parent child with delete_by_query with merge,
however it will make the deletion process much simpler and prolong the life of the disks.

however flattened will not allow us to perform the queries we require which rely on the parent child model..
as i wrote in the example, all the parent events which have child events of "type A" AND "type B".
is there a way to perform such a query on flattened events ("join" like)

I would recommend benchmarking it in order to determine exactly how much the overhead is and what compression that is achieved given that you are repeating the same data for multiple records. Basing this design decision on assumptions and estimates is risky. In addition to saving a lot of processing and disk I/O as you in this solution do not need to update records (which is what a delete is behind the scenes), you may also save some disk space that would have been consumed by the deletion process.

A short update,
we understand that merge can be called per index.

Each index is a few GB therefore we will need temp space for per index merge or bulk of indexes merge therefore we will not need x2.38 the entire storage, just a few GB of temp storage.
also we enabled best_compression to lower the storage a bit further.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.