I have an Elasticsearch cluster as follows:
6 data nodes (hot/warm)
3 data (cold)/master nodes
Would adding 2 coordinating nodes increase or reduce performance?
It takes a bit of load off the data nodes, which can be beneficial. How much impact this has on performance will depend on what is limiting it. If you are currently limited by CPU or heap it may help a bit, but it is unlikely to help much if the performance of your storage is the bottleneck. To be sure, you probably need to test.
Wouldn't it be better to use the cold nodes for this instead of adding new coordinating nodes, and give the resources planned for the coordinating nodes to the cold nodes? There is less load on those nodes.
I read somewhere that you said adding a coordinating node to a small cluster causes slowness or extra communication between nodes.
Is my cluster small?
That depends a lot on the use case and what problem you are trying to solve. If you provide more detail around this we can probably provide better guidance.
I have a cluster with the following specifications:
2 data nodes: 48 GB RAM - 12 vCores - 10 TB disk
3 master nodes: 2 GB RAM - 2 vCores
1 coordinating node: 12 GB RAM - 6 vCores
With 0 replicas, I index about 1 TB per day with a shard count of 40. I keep the indices for 25 days.
IOPS utilization on the data nodes is about 10% (measured with iostat -x).
Writing to the index is not a problem. My problem is that Kibana queries are too slow (with about 5 users); queries sent directly to the API are slow as well.
I think my problem is the disk-to-RAM ratio!
I want to set up a new cluster that keeps 45 days of indices, with query response times under 5 seconds (for 5 days) (duration 15 days).
The cluster I want to set up for this has the following specifications:
| Node type | Node role | RAM (GB) | CPU (vCores) | Disk (GB) | Index days |
|---|---|---|---|---|---|
| data | hot | 64 | 12 | 2048 | 2 |
| data | hot | 64 | 12 | 2048 | 2 |
| data | hot | 64 | 12 | 2048 | 2 |
| data | warm | 64 | 12 | 4096 | 4 |
| data | warm | 64 | 12 | 4096 | 4 |
| data | warm | 64 | 12 | 4096 | 4 |
| master/data | master/cold | 48 | 10 | 10240 | 10 |
| master/data | master/cold | 48 | 10 | 10240 | 10 |
| master/data | master/cold | 48 | 10 | 10240 | 10 |
1 - Is it a good idea to combine the cold and master roles on the same nodes?
2 - Would adding a coordinating node reduce or increase my query speed in the new cluster?
3 - Do you have a better suggestion for my new cluster?
Running without a replica shard is risky, as any problem in the cluster could result in data loss. It also IMHO makes the cluster more difficult to maintain, as rolling upgrades require you to first move data off the nodes.
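For reference, enabling a replica on existing indices goes through the index settings API (`PUT <index>/_settings` with `index.number_of_replicas`). The snippet below is only a sketch that builds such a request without sending it; the `logs-*` index pattern and the `http://localhost:9200` address are placeholders I made up, not details from this thread. Keep in mind that one replica roughly doubles the disk space the indices need.

```python
import json
import urllib.request


def replica_settings_request(index_pattern, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT <index>/_settings request.

    base_url and index_pattern are placeholders; adjust them to your cluster.
    """
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "logs-*" is a hypothetical index pattern used only for illustration.
req = replica_settings_request("logs-*", 1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a live cluster you would call urllib.request.urlopen(req).
```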
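On combining the cold and master roles (question 1): I would avoid this, as the cold nodes will hold a lot of data and could be under heavy query load.
On adding a coordinating node (question 2): I would not expect it to make much difference for this use case, but would recommend that you test to make sure.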
I would recommend that you run some benchmarks in order to find out how much data and indexing each type of node can handle while still supporting your query SLA.
If we assume that all indexing is done on the hot nodes and that indices are read-only once moved to the next tier, I would recommend first benchmarking the hot nodes. For these I would recommend having a replica enabled, as it is hard to completely secure indices that are constantly changing using snapshot/restore. This will mean that you need more nodes in this tier. In this very old elastic{ON} talk I talk about a benchmark I ran on a cluster. I simply limited the indexing rate while running and measuring the performance of typical queries/dashboards. As I stepped this up I could find a point where any additional indexing meant queries were too slow. That was the indexing rate one of my hot nodes could support. Based on that I could size my hot tier.
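The stepping procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are mine, the measurement pairs are made-up placeholders (not benchmark results), and multiplying the indexing load by the number of shard copies is a simplification of what a replica costs.

```python
import math


def max_sustainable_rate(measurements, sla_seconds):
    """Return the highest indexing rate whose query latency still met the SLA.

    measurements: (indexing_rate, query_latency_seconds) pairs, one per
    benchmark step. Steps are walked in increasing rate order and the walk
    stops at the first step that breaches the SLA.
    """
    best = None
    for rate, latency in sorted(measurements):
        if latency > sla_seconds:
            break
        best = rate
    return best


def hot_nodes_needed(total_rate, per_node_rate, replicas=1):
    """Estimate the hot tier size.

    Assumption: every replica is indexed as well, so the tier has to absorb
    (1 + replicas) times the primary indexing rate.
    """
    return math.ceil(total_rate * (1 + replicas) / per_node_rate)


# Placeholder numbers for illustration only: (GB/day per node, latency in s).
steps = [(100, 1.2), (200, 2.0), (300, 3.5), (400, 6.1)]
per_node = max_sustainable_rate(steps, sla_seconds=5.0)

# About 1 TB/day (taken as 1000 GB/day) with one replica, as discussed above.
print(per_node, hot_nodes_needed(1000, per_node, replicas=1))
```

With real measurements from your own dashboards in place of `steps`, the same two functions give a first estimate of the hot tier size, which you would then confirm by testing.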
You can do the same for the warm/cold tiers, but instead test query performance as a function of the amount of data held on the node. As you have a relatively short overall retention period, you may go with just a warm tier with a bit more storage. If you accept slower queries against old data, I guess separate warm and cold tiers may make sense though.
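As a starting point for those tests, it may help to compare the disk-to-RAM ratio of each proposed node type with the current data nodes. The sketch below uses the figures from the tables in this thread and assumes the RAM and disk columns are in GB; it uses disk capacity, so it is an upper bound on the data a node would actually hold.

```python
def disk_to_ram_ratio(disk_gb, ram_gb):
    """Disk capacity per GB of RAM on a node (higher means less cache per byte stored)."""
    return disk_gb / ram_gb


# (disk GB, RAM GB) per node, taken from the thread; units are assumed to be GB.
nodes = {
    "current data": (10 * 1024, 48),
    "hot": (2048, 64),
    "warm": (4096, 64),
    "cold": (10240, 48),
}

ratios = {name: disk_to_ram_ratio(disk, ram) for name, (disk, ram) in nodes.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} GB of disk per GB of RAM")
```

Under these assumptions the proposed cold nodes end up with the same ratio as the current data nodes, so it is worth checking in a benchmark whether queries against data on that tier are acceptably fast.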