EC2 instances for AWS Elasticsearch with good write performance and storage

Natalia_Kuleniuk · December 17, 2018, 9:55pm

Hi ElasticSearch community.

I’m using AWS Elasticsearch and need advice on which instance types to choose for my cluster and how many shards to use. My use case is the following:

My system produces at most 150,000,000 records per day (i.e. about 1,736 records per second; each record is ~900 bytes, so about 1.5 MB per second). My system will publish records to ES in batches of 5 MB (since 5000/1500 = 3.3 sec, there will be a batch request to ES roughly every 3.3 seconds, with ~6,000 records each).
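As a quick sanity check on the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in the post (150,000,000 records per day, ~900 bytes per record, 5 MB batches); the function name `ingest_profile` and the use of decimal megabytes (1 MB = 1,000,000 bytes) are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_profile(records_per_day, record_bytes, batch_bytes):
    """Derive per-second and per-batch figures from a daily record volume.

    Hypothetical helper for illustration; sizes are raw payload bytes and
    ignore any indexing overhead or replicas.
    """
    records_per_sec = records_per_day / SECONDS_PER_DAY
    bytes_per_sec = records_per_sec * record_bytes
    return {
        "records_per_sec": records_per_sec,
        "mb_per_sec": bytes_per_sec / 1e6,
        "seconds_per_batch": batch_bytes / bytes_per_sec,
        "records_per_batch": batch_bytes / record_bytes,
        "gb_per_day": bytes_per_sec * SECONDS_PER_DAY / 1e9,
    }


# Figures from the post: 150M records/day, ~900 bytes each, 5 MB batches.
profile = ingest_profile(150_000_000, 900, 5_000_000)
for name, value in profile.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs the sketch gives roughly 1,736 records per second, about 1.56 MB per second, one 5 MB batch about every 3.2 seconds with roughly 5,600 records in it, and about 135 GB of raw data per day if the maximum rate is sustained.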

The number of times this data will be read from ES in Kibana is relatively small (for example, maybe 2-3 times a day). So, for me, the most important part is selecting EC2 instances for my nodes that have a lot of storage and can handle a lot of write operations.

I need to figure out how many instances I need and what instance type will fit my needs. I also need to understand how many shards I need when creating an index.

I ran a load test with the amount of data described above, using just 3 m4.large.elasticsearch instances, and my cluster went RED very quickly.

Christian_Dahlqvist · December 18, 2018, 7:36am

The size of the cluster is often driven by a combination of ingest volume and storage needs. As you have not outlined how long you need to keep your data, it is hard to provide any sizing guidance. How much data a node can ingest and hold will depend on the data, the querying and how well you optimise your indices. Make sure you follow the guidelines around sharding in this blog post and also watch this webinar.

Even though you are primarily indexing into the cluster, you will also need to take querying into account as I assume you still have some performance expectations when you actually do query the data.

A good way to handle log and metrics type data with a long (or reasonably long) retention period is to use a hot/warm architecture. This is available on our hosted Elasticsearch service.

Natalia_Kuleniuk · December 21, 2018, 12:59am

I need to keep the data for a long time (at least a year). I calculated that 13GB of data per day will give me 4TB of data per year. I need to make sure that I will have enough storage. As I mentioned, I don't need to run queries on a regular basis, just sometimes. Yes, I want my queries to be quick, but for me right now the most important part is writing to the cluster and storing the data for a long time, rather than querying it.
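A rough way to turn a daily volume into a storage estimate is sketched below in Python. The 13 GB per day and the one-year retention come from the post; the function name `storage_needed_gb`, the replica count and the free-space headroom are assumptions chosen purely for illustration, not recommendations, and the on-disk index size can differ from the raw data size.

```python
def storage_needed_gb(gb_per_day, retention_days, replicas=1, headroom=0.25):
    """Estimate total cluster storage in GB (hypothetical helper).

    replicas: assumed number of replica copies per primary shard.
    headroom: assumed fraction of disk to keep free (0.25 = 25% free).
    """
    primary = gb_per_day * retention_days
    with_replicas = primary * (1 + replicas)
    return with_replicas / (1 - headroom)


# Primary data only, no replicas, no headroom: 13 GB/day for 365 days.
primary_only = storage_needed_gb(13, 365, replicas=0, headroom=0.0)
print(f"primary only: {primary_only / 1000:.2f} TB")

# Same volume with the assumed single replica and 25% free space.
with_assumptions = storage_needed_gb(13, 365)
print(f"with 1 replica and 25% headroom: {with_assumptions / 1000:.2f} TB")
```

At 13 GB per day this gives about 4.7 TB of primary data per year, and about 12.7 TB under the assumed replica and headroom settings. Note that 150,000,000 records of ~900 bytes each, the maximum given in the first post, would come to about 135 GB per day rather than 13 GB, so it is worth rechecking which daily figure applies before sizing storage.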

system · January 18, 2019, 12:59am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
