- What would be the recommended infrastructure for Elasticsearch for a workload of 45,000 events per second?
- What would be the recommended compute requirements?
- How many clusters should I have, with the bare minimum of master and data nodes?
- The payload size is approximately 0.5 KB to 1 KB per event.
Your quick assistance would be appreciated.
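For rough scale, the numbers in the question work out to a raw ingest volume that can be sketched in a few lines. This is a back-of-envelope estimate only, using the worst-case 1 KB event size from the question; actual on-disk index size also depends on mappings, compression, and replica count.

```python
# Back-of-envelope ingest sizing for the workload described above.
# Assumes the worst-case 1 KB/event figure from the question; real
# index size on disk will differ.

EVENTS_PER_SEC = 45_000
EVENT_SIZE_KB = 1.0          # worst case from the question
SECONDS_PER_DAY = 86_400

raw_kb_per_day = EVENTS_PER_SEC * EVENT_SIZE_KB * SECONDS_PER_DAY
raw_tb_per_day = raw_kb_per_day / 1_000_000_000  # KB -> TB (decimal units)
print(f"Raw ingest: ~{raw_tb_per_day:.1f} TB/day")  # -> ~3.9 TB/day
```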
Still waiting for your recommendation on my query.
Is the 45,000 a peak or average rate? How long are you going to keep the data once it is indexed? What type of data is it? What do the query patterns and latency requirements look like? What type of hardware are you looking to deploy on?
Just FYI, we have just turned winlogbeat on on many domain controllers with `ignore_older: 72h`, so we had an instant backlog of data. We started ingesting at about 10K events/sec to six Dell R640s running Logstash, which are also Elasticsearch data nodes on spinning disk (budget won over performance). Logstash had 3 pipeline workers. I changed that to 8 and ingest went up to over 20K/sec; increasing it to 12 got us to almost 50K/sec. Logstash will use a CPU core per worker thread when busy. This was on top of all other normal ingest of approximately 3K/sec. We sustained this rate for a few hours until the 72 hours of old data had been ingested.
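The worker tuning described above corresponds to a couple of settings in `logstash.yml`. A minimal sketch, assuming the worker counts from this post (by default `pipeline.workers` equals the number of CPU cores on the host):

```yaml
# logstash.yml -- sketch of the tuning described above.
# pipeline.workers defaults to the host's CPU core count;
# raising it helped here because the filter stage was the bottleneck.
pipeline.workers: 12
pipeline.batch.size: 125   # Logstash default; raise cautiously, costs heap
```

Note that each busy worker can saturate a CPU core, so setting workers well above the core count mostly adds contention rather than throughput.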
Our design is for eventual 10K/sec, so I think we'll be able to do that.
Maybe, but you might be able to achieve your target performance with fewer nodes if you were using SSDs. The nightly benchmarks run on a 3-node cluster (with SSDs) and exceed the performance you're seeing here by quite some margin.
The size of the cluster also depends heavily on how long you are keeping the data and what your query requirements are, which you have not yet detailed.
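To illustrate why retention matters so much for sizing, here is a rough storage sketch. The retention period and replica count below are hypothetical placeholders, since the question does not state them:

```python
# Rough cluster-storage sketch. Retention and replica figures are
# assumptions for illustration only, not from the thread.

raw_gb_per_day = 45_000 * 1.0 * 86_400 / 1_000_000  # ~3,888 GB/day raw
retention_days = 30   # assumption: one month of data kept online
replicas = 1          # assumption: one replica copy per primary shard

total_gb = raw_gb_per_day * retention_days * (1 + replicas)
print(f"~{total_gb / 1000:.0f} TB of cluster storage")  # -> ~233 TB
```

At 30 days and one replica, the raw figure alone lands well over 200 TB, which is why the retention and query requirements change the answer by an order of magnitude.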