I have an Elasticsearch 7.0.3 cluster with 3 Ubuntu nodes. I have not declared any node as a master or data node; by default the cluster elects one master itself and the others act as data nodes. The cluster holds a single index with 1 primary shard and 1 replica, so the primary shard holds 480 GB of data and the replica shard holds another 480 GB.
The application has been running for the last 3 years, during which the data has grown from 50 GB to 480 GB.
Each server had 8 CPU cores and 16 GB RAM, with refresh_interval = 2s, number_of_shards = 1, and number_of_replicas = 1.
When the application went down under traffic load, I upgraded all three servers from 8 cores / 16 GB RAM to 16 cores / 32 GB RAM. After the upgrade I also routed more traffic to it, and the application is now working fine.
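Which node the cluster elected as master can be confirmed with the cat nodes API (an illustrative Dev Tools request; the `master` column marks the elected master with `*`):

```
GET _cat/nodes?v&h=name,node.role,master
```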
First Cluster Config Details
Node 1 config
cluster.name: es-cluster-mi
node.name: master-01-mi
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: [_local_,_site_]
http.port: 9200
discovery.seed_hosts: ["17.17.17.1", "17.17.17.2", "17.17.17.3"]
cluster.initial_master_nodes: ["17.17.17.1"]
Node 2 Config
cluster.name: es-cluster-mi
node.name: data-01-mi
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: [_local_,_site_]
http.port: 9200
discovery.seed_hosts: ["17.17.17.1", "17.17.17.2", "17.17.17.3"]
cluster.initial_master_nodes: ["17.17.17.1"]
Node 3 Config
cluster.name: es-cluster-mi
node.name: data-02-mi
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: [_local_,_site_]
http.port: 9200
discovery.seed_hosts: ["17.17.17.1", "17.17.17.2", "17.17.17.3"]
cluster.initial_master_nodes: ["17.17.17.1"]
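With number_of_shards = 1, the entire 480 GB sits in a single primary shard (plus one 480 GB replica). This can be verified with the cat shards API (illustrative request; substitute your actual index name for the placeholder):

```
GET _cat/shards/<index-name>?v&h=index,shard,prirep,state,store,node
```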
On the other side, I decided to create a proper Elasticsearch cluster (version 7.0.3) with 3 dedicated master nodes and 3 data nodes.
Each master node has 4 cores and 8 GB RAM, and each data node has 8 cores and 16 GB RAM.
But now, when I move the production application traffic to the new cluster, the application response time increases and CPU usage hits 100% on all data nodes.
I route all requests through a load balancer to the master nodes, and run nginx on each master server because the ALB does not support username/password authentication. Below is the config of the new cluster.
On the new cluster I changed the settings to refresh_interval = 40s, number_of_shards = 10, and number_of_replicas = 1.
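As a rough sanity check on the new shard count (a sketch using the 480 GB figure quoted above; 10–50 GB per shard is the commonly cited guideline), the average per-shard sizes work out to:

```shell
# Average primary-shard size = total index size / primary shard count (in GB).
echo "old layout: $((480 / 1)) GB per shard"    # 1 primary shard
echo "new layout: $((480 / 10)) GB per shard"   # 10 primary shards
```

Note that number_of_shards is fixed at index creation, so moving from 1 to 10 shards requires creating a new index and reindexing into it (or using the split API).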
New Cluster Config
Master 1
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-master-01
node.master: true
node.data: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
thread_pool:
  search:
    size: 15
    queue_size: 1500
    min_queue_size: 1000
    max_queue_size: 2000
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
Master 2
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-master-02
node.master: true
node.data: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
thread_pool:
  search:
    size: 15
    queue_size: 1500
    min_queue_size: 1000
    max_queue_size: 2000
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
Master 3
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-master-03
node.master: true
node.data: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
thread_pool:
  search:
    size: 15
    queue_size: 1500
    min_queue_size: 1000
    max_queue_size: 2000
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
Data Node 1
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-data-01
node.master: false
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
thread_pool:
  search:
    size: 30
    queue_size: 3000
    min_queue_size: 3000
    max_queue_size: 5000
Data Node 2
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-data-02
node.master: false
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
thread_pool:
  search:
    size: 30
    queue_size: 3000
    min_queue_size: 3000
    max_queue_size: 5000
Data Node 3
cluster.name: new-es-cluster-mi-01
node.name: new-es-cluster-data-03
node.master: false
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
thread_pool:
  search:
    size: 30
    queue_size: 3000
    min_queue_size: 3000
    max_queue_size: 5000
discovery.seed_hosts: ["17.18.18.1", "17.18.18.2", "17.18.18.3", "17.18.18.4", "17.18.18.5", "17.18.18.6"]
cluster.initial_master_nodes: ["17.18.18.1", "17.18.18.2", "17.18.18.3"]
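To narrow down where the CPU time is going on the data nodes, the hot threads and thread pool cat APIs are a good first step (illustrative Dev Tools requests; the `rejected` column shows whether the search queue is overflowing):

```
GET _nodes/hot_threads
GET _cat/thread_pool/search?v&h=node_name,active,queue,rejected
```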
Please provide me a solution for the high CPU usage.