What hardware do I need for a 100 GB per day data volume?


(Vikas Gopal) #1

I am setting up a cluster and I have the following scenario:

  1. Data volume: 100 GB per day (with 3-month retention)
  2. Search scenarios: dashboards only, a mix of analyzed and not-analyzed fields, max 10 users
  3. I am going to use 8 servers:

3 master-eligible nodes (6 GB RAM, 2-core CPU, 50 GB hard disk each)
2 data nodes (64 GB RAM, 8-core CPU, 10 TB total, to retain data for 3 months)
1 coordinating node (6 GB RAM, 2-core CPU, 60 GB hard drive)
1 Logstash machine (8 GB RAM, 2 CPUs, 50 GB hard drive)
1 Kibana machine (8 GB RAM, 2 CPUs, 50 GB hard drive)
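As a quick sanity check on the disk figures above, here is a back-of-envelope sizing calculation. The replica count and the assumption that on-disk index size is roughly equal to raw ingest size are mine, not from the thread; real index sizes vary with mappings and compression.

```python
def disk_needed_gb(daily_gb, retention_days, replicas=1, index_overhead=1.0):
    """Total disk needed across all data nodes, in GB.

    replicas: replica copies per primary shard (assumed 1 here).
    index_overhead: on-disk size relative to raw ingest (assumed 1.0).
    """
    return daily_gb * retention_days * (1 + replicas) * index_overhead

# 100 GB/day, 90-day retention, one replica:
total = disk_needed_gb(daily_gb=100, retention_days=90)
print(total)  # 18000.0 GB, i.e. ~18 TB across the cluster
```

Under these assumptions, 10 TB total would only cover the retention window with no replicas (9 TB); with one replica, the cluster would need roughly 18 TB.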

Is this hardware sufficient for my scenario?

Regards
VG


#2

Hi,

your hardware pool is quite different from mine, but here are some thoughts...

I'm not sure you really need the dedicated coordinating node, but there may be some benefit to having one...

As for the data nodes, that is a lot of storage per node compared to what I have. I have about 7 TB of data and total heap usage of 160 GB, spread over 20 nodes with about 20 GB of heap each. That does not mean your setup will not work. More nodes and more indices/shards increase heap usage, and I have a fair number of indices and shards.

If high availability is not a concern, then one Logstash machine might be enough. How much work does Logstash do? Do you use a lot of filters? I guess that is a question for the Logstash board...

Theoretically you waste the least resources by having as few nodes and indices/shards as possible, so with a good sharding strategy it may work well. It also depends on how complex the documents and queries are.
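To make the "good sharding strategy" point concrete, here is a rough shard-count estimate for daily indices. The 50 GB target shard size and the one-replica assumption are mine (a commonly cited guideline, not stated in this thread):

```python
import math

def shard_plan(daily_gb, retention_days, target_shard_gb=50, replicas=1):
    """Estimate primaries per daily index and total shards held over the
    retention window, assuming one index per day."""
    primaries_per_index = math.ceil(daily_gb / target_shard_gb)
    total_shards = retention_days * primaries_per_index * (1 + replicas)
    return primaries_per_index, total_shards

# 100 GB/day over 90 days:
print(shard_plan(100, 90))  # (2, 360): 2 primaries per day, 360 shards total
```

360 shards across 2 data nodes is on the high side but workable; fewer, larger indices (e.g. weekly) would reduce per-shard heap overhead at the cost of coarser retention deletes.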

Just some thoughts :slight_smile:


(Vikas Gopal) #3

Thanks for the reply. So you mean to say I need more data nodes for this volume? Sorry for the confusion; 10 TB is the total disk, i.e. 5 TB per node.
The idea behind having a separate coordinating node is to point Kibana at this node only, as this is the recommendation from Elasticsearch.

Regards
VG


#4

Hi VG,

your setup is so different from mine that I don't really know whether you need more data nodes or not... I guess you will run ES with the maximum recommended heap of ~32 GB? That might be enough for your use case, and it leaves the rest of the RAM for Lucene and the OS. Someone with a more similar setup can give you better advice on that.
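For reference, the ~32 GB ceiling comes from the JVM's compressed object pointers; a minimal sketch of the corresponding heap settings in `config/jvm.options` (the exact cutoff varies by JVM build, and 31g is a commonly used safe value, not a figure from this thread):

```
# config/jvm.options -- keep min and max heap equal, and below the
# compressed-oops threshold (~31g is a common safe value)
-Xms31g
-Xmx31g
```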

-AB


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.