Storing more than 1TB of data per day

Hi Team,

Can Elasticsearch handle ingesting more than 1TB of data per day?
And if the data gets corrupted, is there a backup of it?

Thanks and best regards
Sharon

Yes. I've heard of someone indexing 10 million docs per second.

Elasticsearch keeps replicas of your data.
You can also snapshot/restore data.
Beyond that, you can create hourly indices, for example, and in the worst case just drop the corrupted hourly index, if that is acceptable for your use case (see the sketch below).
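A minimal sketch of the hourly-indices pattern, using Python with the `requests` library against an assumed local cluster at `localhost:9200`; the `logs-` prefix and the concrete index names are made up for illustration:

```python
import datetime
import requests

ES = "http://localhost:9200"  # assumed local cluster; adjust to yours

def hourly_index(prefix="logs", now=None):
    """Route documents to an hourly index, e.g. logs-2024-01-15-13."""
    now = now or datetime.datetime.utcnow()
    return f"{prefix}-{now:%Y-%m-%d-%H}"

# Indexing into the hourly index auto-creates it on first write.
doc = {"message": "example log line",
       "@timestamp": datetime.datetime.utcnow().isoformat()}
requests.post(f"{ES}/{hourly_index()}/_doc", json=doc).raise_for_status()

# Worst case: drop a single corrupted hour instead of losing the whole dataset.
requests.delete(f"{ES}/logs-2024-01-15-13")
```

This way a corruption event costs you at most one hour of data rather than the entire index.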

That said, the team has been working very hard to reduce the risk of corruption.

Once the data is corrupted, can I still snapshot/restore it?

It depends on what you mean by "corrupted", I guess.
Elasticsearch works by writing immutable files. If you snapshot every 10 minutes, for example, you will end up with valid backups.
If for whatever reason one of the newly created immutable files gets corrupted (which is unlikely), you will always be able to restore your index to a previous point in time; the sketch below shows the flow.
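A sketch of that snapshot/restore cycle against the REST API, again via Python `requests`; the repository name, filesystem path, and index name are assumptions (the repository location must be whitelisted in `path.repo` on every node):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Register a shared-filesystem snapshot repository.
requests.put(f"{ES}/_snapshot/my_backup", json={
    "type": "fs",
    "settings": {"location": "/mnt/backups/my_backup"},  # must be in path.repo
}).raise_for_status()

# Take a snapshot. Because segment files are immutable, each snapshot only
# copies the files that earlier snapshots do not already contain, so frequent
# snapshots (every 10 minutes, say) stay cheap.
requests.put(
    f"{ES}/_snapshot/my_backup/snap-1?wait_for_completion=true"
).raise_for_status()

# After detecting corruption: close the index, then restore it from the
# last known-good snapshot.
requests.post(f"{ES}/logs-2024-01-15-13/_close")
requests.post(f"{ES}/_snapshot/my_backup/snap-1/_restore", json={
    "indices": "logs-2024-01-15-13",
}).raise_for_status()
```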

But what is your fear exactly?

Will that be discovered dynamically and automatically, without us having to snapshot it ourselves?

I don’t understand

Is it like a distribution map? Does Elasticsearch have this function?
[screenshot of a shard distribution map]

Elasticsearch manages this through primary and replica shards, and distributes these automatically across the cluster. I would recommend that you read this chapter from Elasticsearch: The Definitive Guide to get a better understanding.
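As a rough illustration, using the same assumed local cluster: shard and replica counts are per-index settings, and Elasticsearch handles the placement itself (a primary and its replica are never allocated to the same node). The index name here is hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# 3 primary shards, each with 1 replica; Elasticsearch spreads the
# six shard copies across the data nodes automatically.
requests.put(f"{ES}/my-index", json={
    "settings": {"number_of_shards": 3, "number_of_replicas": 1},
}).raise_for_status()

# Inspect where each shard copy actually landed.
print(requests.get(f"{ES}/_cat/shards/my-index?v").text)
```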
