Storing more than 1TB of data per day

(雪中的凤凰) #1

Hi Team,

Can Elasticsearch store big data, more than 1TB per day?
If the data gets corrupted, is there a backup of that corrupted data?

Thanks and best regards

(David Pilato) #2

Yes. I heard about someone indexing 10m docs per second.

You have replicas in Elasticsearch.
You can snapshot/restore data.
More than that, you can create hourly indices, for example, and in the worst case you can just drop the corrupted hourly index, if that's OK for your use case.

But the team has been working very hard to reduce that risk of corruption.
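The hourly-index pattern described above can be sketched with the REST API. This is a minimal sketch assuming a local cluster on `localhost:9200`; the index name `logs-2017.01.01-13` and document body are made up for illustration:

```shell
# Index documents into an hourly index (hypothetical name logs-<date>-<hour>).
# The index is created automatically on first write.
curl -XPUT 'localhost:9200/logs-2017.01.01-13/event/1' \
  -H 'Content-Type: application/json' \
  -d '{"message": "example log line"}'

# If that hour's index later turns out to be corrupted, drop just that
# one index; all other hours remain untouched.
curl -XDELETE 'localhost:9200/logs-2017.01.01-13'
```

Because each hour lives in its own index, the blast radius of a corrupted index is limited to at most one hour of data.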

(雪中的凤凰) #3

After the data is corrupted, can we still snapshot/restore it?

(David Pilato) #4

It depends on what you mean by corrupted, I guess.
Elasticsearch works by writing immutable files. If you snapshot every 10 minutes, for example, you will end up with valid backups.
If, for whatever reason, one of the newly created immutable files gets corrupted (which is unlikely to happen), you will always be able to restore your index to a previous point in time.
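The snapshot/restore flow looks like this against the REST API. A minimal sketch: the repository name `my_backup`, the snapshot name `snapshot_1`, and the filesystem location are all assumptions for illustration (a shared filesystem repository must be whitelisted via `path.repo` in `elasticsearch.yml`):

```shell
# Register a shared-filesystem snapshot repository.
curl -XPUT 'localhost:9200/_snapshot/my_backup' \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/backups/my_backup"}}'

# Take a snapshot (e.g. from cron every 10 minutes) and wait for it to finish.
curl -XPUT 'localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

# If corruption is later detected, restore from the earlier snapshot.
curl -XPOST 'localhost:9200/_snapshot/my_backup/snapshot_1/_restore'
```

Snapshots are incremental: each run only copies segment files that are not already in the repository, so frequent snapshots stay cheap.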

But what is your fear exactly?

(雪中的凤凰) #5

Will it be discovered dynamically and automatically, without us taking snapshots ourselves?

(David Pilato) #6

I don’t understand

(雪中的凤凰) #7

Is it like a distributed map? Does Elasticsearch have this function?

(Christian Dahlqvist) #8

Elasticsearch manages this through the use of primary and replica shards, and distributes these automatically across the cluster. I would recommend that you read this chapter from Elasticsearch: The Definitive Guide to get a better understanding.
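The shard and replica counts are just index settings. A minimal sketch, assuming a local cluster and a hypothetical index name `my_index`:

```shell
# Create an index with 5 primary shards, each with 1 replica copy.
# Elasticsearch places primaries and replicas on different nodes
# automatically, so a single node failure loses no data.
curl -XPUT 'localhost:9200/my_index' \
  -H 'Content-Type: application/json' \
  -d '{
    "settings": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  }'
```

`number_of_replicas` can be changed on a live index at any time; `number_of_shards` is fixed at index creation.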

(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.