Elastic stack hardware requirements

Hi

I want to set up a cluster and I have these scenarios:

  • I'm using ES, Kibana, and Filebeat (for logs) on a basic license, with a custom project instead of Logstash.
  • Index-heavy workload with infrequent querying.
  • Monthly index with about 8GB data and 30M documents per month.
  • Availability is not a priority, but (naturally) I can't afford any data loss.
  • Indices are in the hot phase for one month, the warm phase for six months, and after that they should stay in the cold phase for two years.

Currently my cluster is a single node in a VM with a 120 GB hard disk, 8 GB RAM, and an 8-core CPU. Its performance is reasonable, but I feel like it's not a good setup.
I saw the webinar on sizing and capacity planning, but I still can't figure out the best setup for my scenario.

Any answer other than "it depends" would be appreciated :slightly_smiling_face:

It depends. Oh, you wanted more than that ...

Your total data size is 8GB x 24 months? So ~200GB, which is very small for ES, almost small enough to run on your phone :wink:

Why not a small three node cluster with all nodes having all roles, 8GB RAM and 4GB Heap, shard=1, replica=1 and see how it goes?
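As a rough sketch of the shard=1, replica=1 suggestion, the settings could be placed in an index template body like the one built below. The pattern `myapp-logs-*` is a made-up placeholder, and the body assumes an ES version that supports composable index templates (sent with `PUT _index_template/<name>`); nothing here talks to a cluster, it only prints the JSON.

```python
import json


def build_index_template(index_pattern, shards=1, replicas=1):
    """Build a composable index template body with the given shard/replica counts."""
    return {
        "index_patterns": [index_pattern],
        "template": {
            "settings": {
                # One primary shard is plenty for ~8GB per monthly index.
                "number_of_shards": shards,
                # One replica keeps a second copy of every shard on another node.
                "number_of_replicas": replicas,
            }
        },
    }


# "myapp-logs-*" is a hypothetical pattern; use whatever your monthly indices are named.
template_body = build_index_template("myapp-logs-*")
print(json.dumps(template_body, indent=2))
```

The 4GB heap is set separately, through the JVM options (`-Xms4g` and `-Xmx4g`), not through index settings.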

But you need more disk: with one replica the stored data is 2x the primary size, so roughly 400GB (hard to know exactly with compression and overhead), while three servers with 120GB each give only 360GB in total. If you put 250-500GB on each node, it'd be a nice little cluster.
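The disk arithmetic can be sketched in a few lines of Python. The function name and the 85% usage ceiling are my own assumptions (85% matches the default low disk watermark); compression and index overhead are ignored, so treat the result as a ballpark.

```python
def estimate_disk(monthly_gb, months, replicas, nodes, max_disk_usage=0.85):
    """Ballpark disk needs for time-based indices (ignores compression/overhead)."""
    primary_gb = monthly_gb * months
    # Every replica is a full extra copy of the primary data.
    stored_gb = primary_gb * (1 + replicas)
    # Leave free space so nodes stay below the assumed usage ceiling.
    provisioned_gb = stored_gb / max_disk_usage
    return {
        "primary_gb": primary_gb,
        "stored_gb": stored_gb,
        "provisioned_gb": provisioned_gb,
        "per_node_gb": provisioned_gb / nodes,
    }


# Numbers from this thread: 8GB/month, 24 months, 1 replica, 3 nodes.
estimate = estimate_disk(monthly_gb=8, months=24, replicas=1, nodes=3)
for name, value in estimate.items():
    print(f"{name}: {value:.0f}")
```

If you count hot + warm + cold together (1 + 6 + 24 = 31 months), pass `months=31` instead; either way 120GB per node is too tight.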

To save money, you could run two nodes, but recent ES versions need a third voting node anyway, so it's better to go with all three: more flexibility, easier to grow, more query power, etc.
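The reason for the third node comes down to majority voting. Here is a simplified sketch of the arithmetic (ES manages its voting configuration itself, so this only illustrates the general idea; the function names are mine):

```python
def majority(master_eligible_nodes):
    """Smallest number of votes that is more than half of the nodes."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many nodes can be lost while a majority is still reachable."""
    return master_eligible_nodes - majority(master_eligible_nodes)


for n in (1, 2, 3):
    print(f"{n} node(s): majority={majority(n)}, can lose {tolerated_failures(n)}")
```

With simple majority voting, two nodes tolerate no more failures than one node does, while three nodes can lose one.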

A final option, if you really want to save money on a small system, is a single node that you snapshot / back up often: hourly if possible, certainly daily. This has obvious drawbacks (possible loss of in-flight ingest and recent data, lower availability, painful rebuilds and recoveries), but for such a small, rarely used system it's always an option, though not terribly recommended :wink:
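If you go the single-node route, an hourly snapshot schedule could look roughly like the snapshot lifecycle management (SLM) policy body built below. This assumes an ES version with SLM and an already registered snapshot repository; `my_backup_repo`, the snapshot name, and the retention numbers are placeholders, not recommendations. The code only builds and prints the JSON you would send with `PUT _slm/policy/<policy-id>`.

```python
import json


def build_slm_policy(repository, schedule="0 0 * * * ?", expire_after="30d"):
    """Build an SLM policy body that snapshots all indices on a cron schedule."""
    return {
        # ES cron format: seconds minutes hours day-of-month month day-of-week.
        # "0 0 * * * ?" fires at the top of every hour.
        "schedule": schedule,
        # Date-math snapshot name (placeholder naming scheme).
        "name": "<hourly-snap-{now/d}>",
        "repository": repository,
        "config": {"indices": ["*"], "include_global_state": True},
        # Placeholder retention values; tune to your own recovery needs.
        "retention": {"expire_after": expire_after, "min_count": 5, "max_count": 50},
    }


# "my_backup_repo" is a hypothetical repository name.
policy = build_slm_policy("my_backup_repo")
print(json.dumps(policy, indent=2))
```

Anything ingested after the last snapshot is still at risk, which is the data loss window mentioned above.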

Thanks for your reply. :pray:
If I set up a three-node cluster, should I install Kibana and the Logstash-like tool on all of them? Or should I set one of these nodes as a coordinator and install Kibana on that node?
What is your suggestion about CPU for these three nodes?

No, I'd start with logstash on one of them; it doesn't really matter if the volume is low, though of course watch your CPU/RAM use and see how it goes. A separate VM is better, but 8GB/month spread out over the month is a small amount of data and load.

Put Kibana on another node, see how it goes. No coordinator on a cluster this small. Keep it simple.

CPU can be nearly anything; suggest 4 cores to start and monitor.

