I need to optimize my Elasticsearch cluster to get better performance on all kinds of queries.
The current configuration of the cluster is:
Elasticsearch 6.4.0
3 nodes (each one master, data, and ingest)
- 4 core
- 16 GB ram (heap 8GB)
- 1 EBS disk of 1TB
~ 200 indexes
~ 4200 shards (12 shards per index, replica 1: each primary plus one copy)
~ 1.5 billion documents
~ 1.5 TB of data stored
More details about the indexes:
- 1 index of 100GB
- 80 indexes of 2GB each
- 80 indexes of 7GB each
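For context, the figures above can be checked with the `_cat` APIs; a couple of requests I use to see index sizes and how shards are balanced across the three nodes (Kibana Dev Tools syntax):

```
GET _cat/indices?v&h=index,pri,rep,docs.count,store.size&s=store.size:desc

GET _cat/allocation?v
```

The second request shows the shard count per node; with ~4200 shards on 3 nodes that is roughly 1400 shards per node.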
Continuous bulk update queries on the 100GB index (refresh interval 2s).
Inserts on the other indexes (about 10 inserts/second, refresh interval 60s).
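For reference, the refresh intervals described above correspond to index settings like these (index names are made up):

```
PUT big_index/_settings
{
  "index": { "refresh_interval": "2s" }
}

PUT index_date_2018_09/_settings
{
  "index": { "refresh_interval": "60s" }
}
```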
Search queries on all the indexes; many of them use the index_date_* pattern and therefore have to scan a lot of indexes (~80).
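A typical query of this kind fans out over every index matching the pattern; a simplified sketch (field names here are hypothetical, not my real mapping):

```
GET index_date_*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "status": "active" } },
        { "range": { "created_at": { "gte": "now-30d" } } }
      ]
    }
  }
}
```

Each such request hits 12 shards per matched index, so ~80 indexes means ~960 shards queried per search.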
The biggest problem I have with this configuration is the high load on the master node whenever a query that scans 80 indexes is launched. A single query takes about 2 seconds, but many of these queries are launched in parallel, and the execution time then increases (to 20 seconds and more) while the master node's load becomes very high.
I was thinking of a new cluster configuration like this:
3 nodes (data, ingest) (AWS EC2 i3.xlarge instances)
- 4 cores
- 30 GB ram (heap 24GB)
- Local storage NVMe
3 nodes (master) (AWS EC2 i3.large instances)
- 2 cores
- 15GB ram (heap 9GB)
- Local storage NVMe
I know the only way to validate the configuration is to test it in the field, but in your experience, would this be a good starting point for managing my data?