Adding a new node to ES

brandonmcgrath1 · July 29, 2016, 10:24am

Hey guys,
I have a lot of servers sending data to ES and I need to improve its performance. From what I understand, I need to add a new data node?
The active shards count in _cluster/health is slowly dropping and there are now more than 80 unassigned shards. It's only a matter of time until it dies.
So am I correct that adding a new data node would fix this? Here is my cluster health:
{
   "cluster_name" : "eTech_cluster",
   "status" : "yellow",
   "timed_out" : false,
   "number_of_nodes" : 1,
   "number_of_data_nodes" : 1,
   "active_primary_shards" : 5161,
   "active_shards" : 5161,
   "relocating_shards" : 0,
   "initializing_shards" : 0,
   "unassigned_shards" : 91,
   "delayed_unassigned_shards" : 0,
   "number_of_pending_tasks" : 0,
   "number_of_in_flight_fetch" : 0,
   "task_max_waiting_in_queue_millis" : 0,
   "active_shards_percent_as_number" : 98.26732673267327
}
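
For readers following along, the numbers in this health output can be checked with a little arithmetic. A yellow status means every primary shard is assigned but some replicas are not, and Elasticsearch never places a replica on the same node as its primary, so on a one-node cluster replicas stay unassigned. The sketch below is only an illustration: the `summarize` helper is hypothetical, and the percentage formula (active divided by active plus initializing plus unassigned) is an assumption that happens to reproduce the value reported above.

```python
# Values copied from the cluster health output in the post above.
health = {
    "status": "yellow",
    "number_of_data_nodes": 1,
    "active_primary_shards": 5161,
    "active_shards": 5161,
    "initializing_shards": 0,
    "unassigned_shards": 91,
}


def summarize(h):
    """Derive a few totals from a _cluster/health response (hypothetical helper)."""
    # Assumed formula: all shards the cluster knows about, assigned or not.
    total = h["active_shards"] + h["initializing_shards"] + h["unassigned_shards"]
    return {
        "total_shards": total,
        # Active shards that are not primaries are active replicas.
        "active_replicas": h["active_shards"] - h["active_primary_shards"],
        "active_percent": 100.0 * h["active_shards"] / total,
    }


summary = summarize(health)
print(summary)
```

With these figures no replica is active at all, which is consistent with the 91 unassigned shards being replicas that have no second node to live on.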

magnusbaeck · July 29, 2016, 11:51am

Adding another data node is a good idea as temporary relief, but even with two nodes you have far too many shards. I suspect you can reduce the total number of shards by combining indexes and/or reducing the number of shards per index, but without more information it's impossible to give specific advice. After such an operation you may well do fine with a single node again.

brandonmcgrath1 · July 29, 2016, 1:01pm

We have roughly 200 servers that will be sending Winlogbeat logs to ES. What info would you need to give better advice? Also, is ES capable of supporting 200 Windows servers sending logs?

magnusbaeck · July 29, 2016, 1:51pm

The number of servers sending logs is irrelevant. What matters is the total number of events, how they're distributed over time, and if there are reasons to use separate index series.

I'm interested in what kind of indexes you have and how come you have over 5000 shards. Do you have a single index series (e.g. logstash-YYYY.MM.DD) or multiple? How many shards per index? What's the typical size (in bytes) of an index? How many days do you keep indexes?

brandonmcgrath1 · July 29, 2016, 2:07pm

http://192.168.60.90:9200/_cat/indices?v
Produces hundreds/possibly thousands of these:

health status index                 pri rep docs.count docs.deleted store.size pri.store.size
green  open   winlogbeat-2014.10.29   5   0         34            0     79.4kb         79.4kb
green  open   winlogbeat-2014.10.26   5   0         57            0     98.4kb         98.4kb
green  open   winlogbeat-2014.10.25   5   0        265            0    381.5kb        381.5kb
green  open   winlogbeat-2014.10.28   5   0         38            0    103.4kb        103.4kb
green  open   winlogbeat-2014.10.27   5   0         52            0    127.9kb        127.9kb
green  open   winlogbeat-2014.10.22   5   0         42            0     88.4kb         88.4kb
green  open   winlogbeat-2014.10.21   5   0         34            0    100.2kb        100.2kb
green  open   winlogbeat-2014.10.24   5   0         74            0    160.5kb        160.5kb
green  open   winlogbeat-2014.10.23   5   0         46            0    124.5kb        124.5kb
green  open   winlogbeat-2014.10.20   5   0         37            0     80.4kb         80.4kb
green  open   winlogbeat-2014.09.30   5   0         45            0      114kb          114kb
green  open   winlogbeat-2015.04.09   5   0        356            0      390kb          390kb
green  open   winlogbeat-2015.04.08   5   0        476            0    264.5kb        264.5kb
green  open   winlogbeat-2015.04.07   5   0        334            0    284.5kb        284.5kb
green  open   winlogbeat-2015.04.06   5   0        195            0    264.6kb        264.6kb
green  open   winlogbeat-2014.10.19   5   0         35            0    108.3kb        108.3kb
green  open   winlogbeat-2015.04.05   5   0        194            0    293.3kb        293.3kb
green  open   winlogbeat-2014.10.18   5   0         38            0     90.8kb         90.8kb
green  open   winlogbeat-2015.04.04   5   0        216            0    297.9kb        297.9kb
green  open   winlogbeat-2015.04.03   5   0        202            0    256.6kb        256.6kb
green  open   winlogbeat-2014.10.15   5   0         75            0    153.3kb        153.3kb
green  open   winlogbeat-2014.10.14   5   0         38            0     87.8kb         87.8kb
green  open   winlogbeat-2014.10.17   5   0         56            0    134.2kb        134.2kb

Is this what you were asking for?
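
A listing like this is easier to reason about once it is totalled. The following is a minimal sketch, assuming the standard `_cat/indices?v` column layout and open indexes only. The three sample rows pair index names with sizes in the order they appear in the post, which is itself an assumption; `parse_size` and `parse_cat_indices` are hypothetical helpers, not part of any Elasticsearch client.

```python
# Three rows in _cat/indices?v layout; the name/size pairing is assumed.
SAMPLE = """\
health status index                 pri rep docs.count docs.deleted store.size pri.store.size
green  open   winlogbeat-2014.10.29   5   0         34            0     79.4kb         79.4kb
green  open   winlogbeat-2014.10.26   5   0         57            0     98.4kb         98.4kb
green  open   winlogbeat-2014.10.25   5   0        265            0    381.5kb        381.5kb
"""

# Longer suffixes come first so "kb" is not mistaken for plain "b".
UNITS = (("kb", 1024), ("mb", 1024 ** 2), ("gb", 1024 ** 3), ("tb", 1024 ** 4), ("b", 1))


def parse_size(text):
    """Convert a human-readable size such as '79.4kb' to bytes."""
    for suffix, factor in UNITS:
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError("unrecognised size: " + text)


def parse_cat_indices(text):
    """Turn whitespace-separated _cat output into a list of dicts keyed by header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_cat_indices(SAMPLE)
total_primaries = sum(int(r["pri"]) for r in rows)
total_docs = sum(int(r["docs.count"]) for r in rows)
avg_shard_bytes = sum(parse_size(r["pri.store.size"]) for r in rows) / total_primaries

print(total_primaries, "primary shards for", total_docs, "documents")
print("average primary shard size: %.1f kb" % (avg_shard_bytes / 1024))
```

In this sample each shard holds only a few tens of kilobytes, which illustrates why five shards per daily index is far more than the data needs.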

magnusbaeck · July 29, 2016, 3:25pm

Okay, so you have at least one daily index series (winlogbeat-YYYY.MM.DD) with five shards and you've been going at it for a couple of years. Do you have more index series than winlogbeat-xxx? Two years of winlogbeat-xxx indexes with five shards a day still doesn't add up to more than 3650 shards.
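
The 3650 figure can be reproduced directly. This is a small worked example, assuming 365 days per year and exactly one daily index with five primary shards; the comparison uses the 5161 active primary shards from the cluster health output earlier in the thread.

```python
YEARS = 2
DAYS_PER_YEAR = 365          # leap days ignored for this estimate
SHARDS_PER_DAILY_INDEX = 5   # from the "pri" column of the listing

# Upper bound on primaries if winlogbeat-YYYY.MM.DD were the only series.
expected_primaries = YEARS * DAYS_PER_YEAR * SHARDS_PER_DAILY_INDEX

# Reported by _cluster/health in the first post.
observed_primaries = 5161

unexplained = observed_primaries - expected_primaries
print(expected_primaries, observed_primaries, unexplained)
```

The gap of roughly 1500 primaries is what prompts the question about other index series.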

The first and most important step would be to cut down the number of shards per index to one. Regardless of whether you're sending directly to ES from Winlogbeat or if you're using Logstash the key is to modify the index template used. I also suggest that you reduce the number of indexes by using monthly indexes instead of daily, but I'm not sure that's configurable if Winlogbeat sends directly to ES.
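
As a rough illustration of the template change, the sketch below builds (but does not send) a request that would register an extra index template for `winlogbeat-*` with one primary shard. Several things here are assumptions: the template name `winlogbeat-shards` is hypothetical, the body uses the legacy `_template` format with a `template` key (newer Elasticsearch versions use `index_patterns` instead), and `order: 1` is chosen so that these settings override a default template of lower order while leaving its mappings alone. The host is the one mentioned in the thread. Templates only affect indexes created after they are registered.

```python
import json
from urllib.request import Request


def build_template_request(base_url, name="winlogbeat-shards", pattern="winlogbeat-*"):
    """Build a PUT request for a legacy index template (nothing is sent here)."""
    body = {
        "template": pattern,  # legacy key; an assumption about the ES version in use
        "order": 1,           # higher order wins when several templates match
        "settings": {
            "number_of_shards": 1,
            # Zero replicas mirrors the setting tried in the thread; it means no redundancy.
            "number_of_replicas": 0,
        },
    }
    return Request(
        url=base_url + "/_template/" + name,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_template_request("http://192.168.60.90:9200")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually submit it; that step is left out so the sketch can be inspected safely first.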

brandonmcgrath1 · August 3, 2016, 12:57pm

Sorry for the late reply, I've been busy.
We've had the ES system up for about a month, and I have no idea why there are indexes with 2014 dates on them. Could this be because a server or two has the wrong system date, causing it to write 2014?

The current setup is Winlogbeat straight into ES. Would Logstash help with performance? The template file I'm using is completely default. Are there any changes I should be making to it?

When I run this command:
PUT /winlogbeat-*/_settings
{
   "settings": {
      "number_of_shards" : 1,
      "number_of_replicas" : 0
   }
}

I get this return:

{
   "error": {
      "root_cause": [
         {
            "type": "illegal_argument_exception",
            "reason": "can't change the number of shards for an index"
         }
      ],
      "type": "illegal_argument_exception",
      "reason": "can't change the number of shards for an index"
   },
   "status": 400
}

However, I did add this line to the template file: "number_of_shards": 1
If I were to update the template file on all the servers running Winlogbeat, would this achieve the same goal?

Edit: removing the number_of_shards line and leaving only number_of_replicas works and returns "acknowledged": true.
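
The error and the edit above are consistent with how the two settings behave: `number_of_shards` is fixed when an index is created, while `number_of_replicas` can be changed at any time. Existing indexes therefore keep their five shards unless their documents are copied into new indexes. One possible approach, sketched here under assumptions, is to plan `_reindex` requests that merge daily indexes into monthly ones. The `monthly_reindex_plan` helper and the monthly naming scheme are hypothetical, and this assumes the cluster's Elasticsearch version provides the `_reindex` API. The function only builds request bodies and contacts no cluster.

```python
import re
from collections import defaultdict

# Matches daily names such as winlogbeat-2014.10.29 and captures winlogbeat-2014.10.
DAILY = re.compile(r"^(winlogbeat-\d{4}\.\d{2})\.\d{2}$")


def monthly_reindex_plan(index_names):
    """Group daily index names by month and build one _reindex body per month."""
    by_month = defaultdict(list)
    for name in index_names:
        match = DAILY.match(name)
        if match:  # anything that is not a daily winlogbeat index is skipped
            by_month[match.group(1)].append(name)
    return {
        target: {"source": {"index": sorted(sources)}, "dest": {"index": target}}
        for target, sources in sorted(by_month.items())
    }


plan = monthly_reindex_plan([
    "winlogbeat-2014.10.29",
    "winlogbeat-2014.10.26",
    "winlogbeat-2015.04.09",
    "some-other-index",
])
for target, body in plan.items():
    print("POST /_reindex", body)
```

Each body would be sent as a POST to `/_reindex`, and the daily source indexes deleted only after the document counts in the monthly index have been verified. The monthly targets also match `winlogbeat-*`, so they pick up whatever shard count the index template specifies at creation time.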
