Elastic sizing

Hi,

Architecture 1:

I have designed an Elasticsearch architecture for 1GB of daily incoming data with a 365-day retention period.

3 master nodes with 8 CPUs, 32GB RAM, and 50GB disk space per node.
8 data nodes with 8 CPUs, 64GB RAM, and 7.5TB disk space per node.
3 client nodes with 8 CPUs, 32GB RAM, and 50GB disk space per node.
No. of primary shards: 1 per index
No. of replica shards: 2 per index
No. of indices: 10

Architecture 2:

This Elasticsearch architecture was designed for us by another team, using the same parameters: 1GB of daily incoming data with a 365-day retention period.

5 nodes, each acting as both a master and a data node, with 8 CPUs, 64GB RAM, and 3TB disk space per node.

No. of primary shards: 2 per index
No. of replica shards: 2 per index
No. of indices: 10

I'm really confused about which architecture would be better.
Any suggestions would help me proceed further.

It sounds like you plan to generate 20 primary shards and 40 replica shards per day. Over 365 days that gives you 21,900 shards, which is a lot for a system ingesting only 1GB of data per day. You can read this blog post about shards and sharding to learn why this is not a good idea.
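For reference, here is the arithmetic behind those numbers written out as a small sketch. It assumes Architecture 2's settings of 2 primaries and 2 replicas per index, with one new set of the 10 indices created each day:

```python
# Back-of-the-envelope shard count for daily indices
# (Architecture 2: 10 indices per day, 2 primaries and 2 replicas per index).
indices_per_day = 10
primaries_per_index = 2
replicas_per_primary = 2

primary_shards_per_day = indices_per_day * primaries_per_index          # 20
replica_shards_per_day = primary_shards_per_day * replicas_per_primary  # 40
shards_per_day = primary_shards_per_day + replica_shards_per_day        # 60

retention_days = 365
total_shards = shards_per_day * retention_days
print(total_shards)  # 21900
```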

If you need to maintain 10 separate indices and are not able to consolidate them, I would recommend that you instead use monthly indices, potentially with a single primary shard. This will give you a much more manageable 30 shards per month - 360 total shards over the year.
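As a minimal sketch of what that could look like, assuming Elasticsearch 7.8+ and its composable index template API; the `myapp-*` index pattern, template name, and cluster URL below are placeholders rather than anything from your setup:

```python
# Hypothetical sketch: a composable index template that gives each monthly
# index 1 primary shard and 2 replicas. Names and URL are placeholders.
import requests

ES_URL = "http://localhost:9200"  # assumption: local, unsecured cluster

template = {
    "index_patterns": ["myapp-*"],      # e.g. myapp-2024.01, myapp-2024.02, ...
    "template": {
        "settings": {
            "number_of_shards": 1,      # single primary shard per monthly index
            "number_of_replicas": 2,    # two replica copies, as in your design
        }
    },
}

resp = requests.put(f"{ES_URL}/_index_template/myapp-monthly", json=template)
resp.raise_for_status()
print(resp.json())
```

You would apply the same settings to each of the 10 index families if they keep separate naming patterns.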

If we instead look at data volume, you are only indexing 365GB of raw data per year as far as I understand. If we make the simplifying assumption that this size stays the same when indexed to disk, the 2 replica shards give you an estimated total indexed volume of around 1.1TB. I would expect a cluster with 3 master/data nodes to handle this easily, so unless you have factored in a lot of growth, I would say both architectures may be oversized.
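The volume estimate works out like this, under the same simplifying assumption that indexed size equals raw size:

```python
# Rough yearly storage estimate, assuming the on-disk indexed size equals
# the raw size and that the data is kept as 1 primary + 2 replica copies.
daily_raw_gb = 1
retention_days = 365
copies = 1 + 2  # 1 primary + 2 replicas

total_gb = daily_raw_gb * retention_days * copies
print(f"{total_gb} GB, roughly {total_gb / 1000:.1f} TB")  # 1095 GB, roughly 1.1 TB
```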

Thanks Christian_Dahlqvist. The data is not always going to be 1GB; during an outage we may receive up to 1TB of data. How can I plan for that?

Thanks,
Saravanan

I am not sure I understand your question. Could you please clarify?

Normally we get 1GB of data daily, but during an outage we can get up to 1TB of data. During an outage we receive a huge volume of alerts from the data center.

That is a huge difference. You will probably need to size for the larger volume, which will require a larger cluster than I mentioned, but it is hard for me to say how large based on that information alone. I would recommend running some benchmarks to determine the correct size.

Which architecture should I follow? Also, can you suggest benchmarking tools that would help me figure out the correct size?

If you get 1TB of data in a day, how is it distributed over time? How much lag in ingesting it can you tolerate?

Determining this will give you the peak indexing rate that your cluster needs to support. You can then benchmark what size cluster you need to sustain that rate for a period of time, e.g. using Rally. Make sure that you also include realistic levels of querying in your benchmark.
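To make that concrete, here is a rough way to turn an outage volume into a peak indexing rate; the burst window and average document size below are assumed figures for illustration only:

```python
# Rough peak indexing rate for the outage scenario. The burst window and the
# average document size are assumptions made purely to illustrate the
# calculation; substitute the values you actually observe.
outage_volume_gb = 1000      # ~1 TB arriving during an outage
burst_window_hours = 4       # assumption: most of it arrives within 4 hours
avg_doc_size_kb = 1          # assumption: ~1 KB per alert document

gb_per_second = outage_volume_gb / (burst_window_hours * 3600)
mb_per_second = gb_per_second * 1024
docs_per_second = mb_per_second * 1024 / avg_doc_size_kb

print(f"~{mb_per_second:.0f} MB/s, ~{docs_per_second:,.0f} docs/s to sustain")
```

That docs-per-second figure is what your benchmark would need to show the cluster sustaining, alongside your normal query load.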
