Big data storage for large data volumes

Hi Team,

Do we have any options for storing data in a big data platform when the daily data volume is large?

What big data platform are you referring to?

Hi Team,

We need to store about 2TB of data per day, so we don't think Elasticsearch will be able to meet the requirement. Is there a way to integrate the system with a big data storage platform?

I know of Elasticsearch clusters capable of handling considerably more than 2TB of ingest per day. How long are you going to store the data? How long do you need it to be searchable in Elasticsearch?

What is the use case?

Hi,

We need to keep 3 months of data, which mainly arrives as batch application logs.
We have only 500GB of internal storage. How can we store this amount of data? Do we need external disks or a big data storage platform? Can you advise on this?

Is there any estimate of the hardware needed (number of nodes, vCPUs, RAM and storage per node)?

If you are going to query data through Elasticsearch, the data has to reside in Elasticsearch.
If we assume the size of the data stays the same once indexed (it can grow or shrink depending on a number of factors) and that you will have 1 replica, keeping 3 months of data at 2TB per day means roughly 360TB of storage (2TB per day, times about 90 days, times 2 copies), so 500GB will not go far.
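
To sanity-check an estimate like this, multiply the daily ingest by the retention period and by the number of copies. Below is a minimal sketch of that arithmetic, not an official sizing tool. The function name `required_storage_tb` and the `index_ratio` parameter are hypothetical, and it assumes a month is about 30 days and that indexed size equals raw size unless you say otherwise.

```python
def required_storage_tb(daily_ingest_tb, retention_days, replicas=1, index_ratio=1.0):
    """Rough estimate of the total disk needed, in TB.

    index_ratio is indexed size divided by raw size (an assumption:
    1.0 means the data neither grows nor shrinks once indexed).
    """
    copies = 1 + replicas  # one primary copy plus its replicas
    return daily_ingest_tb * index_ratio * retention_days * copies


if __name__ == "__main__":
    # 2TB per day with 1 replica, for one month and for three months
    for days in (30, 90):
        total = required_storage_tb(daily_ingest_tb=2, retention_days=days)
        print(f"{days} days: {total:.0f} TB")
```

For 2TB per day and 1 replica this prints 120 TB for 30 days and 360 TB for 90 days. It ignores free-space headroom and operating system overhead, so the real requirement will be somewhat higher.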

I would recommend looking at the following resources:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

Hi

We will not get more than 500GB of internal storage in our system. Can you suggest how we can store 2TB of data per day for our application?

Do we have any option to integrate Elasticsearch with an external data store? If so, please suggest how.

What kind of external storage do you have in mind? SAN?

Can you please suggest how to store this large amount of data per day and keep it for at least 3 months, as we don't have much internal storage on our servers?

I can't unless you answer my questions. You will need access to a good amount of storage, as I outlined earlier. Do you have access to a SAN that you can mount on the nodes?

Yes, we have access to a SAN, but that is also not sufficient for our requirement.

You will need to get access to storage and hardware somehow to make this work. Exactly how you do that does not, however, seem to be something we can help you with here. Have you considered hosting in the cloud?
