Unassigned shards (inexperienced user!)


(Pastrufazio) #1

Hi all,

I inherited a Kibana + Elasticsearch server on Amazon AWS (a single node) from an ex-colleague.

This instance is very slow and I'm trying to figure out what is wrong with it.

The first thing I noticed (see below) is that there seem to be 2271 unassigned shards.

What does it mean?

$ curl -XGET localhost:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host      ip        node        
  2271        1.4gb    13.4gb     17.9gb     31.3gb           42 127.0.0.1 127.0.0.1 Shatterstar 
  2271                                                                               UNASSIGNED

(David Pilato) #2

As you don't have more than one node, replicas can't be allocated.

Not really a problem yet.
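If you want to confirm that the unassigned shards really are replicas (and optionally silence the yellow warning), a sketch, assuming the same localhost setup as your `_cat/allocation` call:

```shell
# List unassigned shards together with the reason they are unassigned.
# On a single node, expect every one to be a replica ("r" in the
# prirep column) that simply has no second node to live on.
curl -XGET 'localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

# Optional: on a single node replicas add no redundancy, so you can
# drop the replica count to 0 on all open indices. This removes the
# unassigned shards and turns the indices green.
curl -XPUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '
{
  "index": { "number_of_replicas": 0 }
}'
```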

BTW did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK ?

Cloud by Elastic is one way to get access to all features, all managed by us. Think about what is already there, like Security, Monitoring, Reporting, and SQL, and what is coming, like Canvas...

You also probably have too many shards per node.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right


(Pastrufazio) #3

Thank you very much David, I will take all your suggestions into account.


(Mark Walkom) #4

You have way too many shards for a single node. You should look at removing old indices and then changing your index templates to use fewer shards.
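For example, a sketch of the cleanup step (the index names here are placeholders, so list what you actually have before deleting anything):

```shell
# First, inspect the indices sorted by name to see what is old:
curl -XGET 'localhost:9200/_cat/indices?v&s=index'

# Then delete a whole range of stale daily indices with a wildcard
# (irreversible -- double-check the pattern first):
curl -XDELETE 'localhost:9200/branch_01_devices-2010.02.*'
```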


(Pastrufazio) #5

Hi Mark,

do I need to remove old indices, or is it enough to close them? I ask because yesterday I already closed all unnecessary indices. More precisely: I closed indices that are not needed at the moment (but might be again later).

Two other things:

  1. ALL MY OPEN INDICES ARE MARKED AS YELLOW...

I read that this is not a problem when using a single node. Right?

yellow open   branch_01_devices-2017.12.19       5   1       7883            0      1.7mb          1.7mb 
yellow open   branch_01_devices-2017.12.18       5   1       7909            0      1.6mb          1.6mb 
yellow open   branch_01_devices-2017.12.24       5   1       8476            0      2.1mb          2.1mb 
yellow open   branch_01_devices-2017.12.23       5   1       7998            0      1.8mb          1.8mb 
yellow open   branch_01_devices-2017.12.26       5   1       8928            0      2.2mb          2.2mb 
yellow open   branch_01_devices-2017.12.25       5   1       7773            0      1.7mb          1.7mb 
yellow open   branch_01_devices-2017.12.20       5   1       7910            0      1.8mb          1.8mb 
yellow open   branch_01_devices-2017.12.22       5   1       7990            0      1.8mb          1.8mb 
yellow open   branch_01_devices-2017.12.21       5   1       8016            0      1.8mb          1.8mb 
yellow open   branch_01_devices-2017.12.28       5   1      10186            0      2.6mb          2.6mb 
yellow open   branch_01_devices-2017.12.27       5   1       9295            0      2.3mb          2.3mb 
yellow open   branch_01_devices-2017.12.29       5   1      10784            0      2.7mb          2.7mb 
yellow open   branch_01_devices-2018.08          5   1     323742            0     68.2mb         68.2mb 
yellow open   branch_01_devices-2018.07          5   1     338342            0     68.7mb         68.7mb 
yellow open   branch_01_devices-2018.09          5   1     275612            0       58mb           58mb 
yellow open   branch_01_devices-2018.04          5   1        166            0    163.6kb        163.6kb 
yellow open   branch_01_devices-2018.06          5   1     144000            0     29.3mb         29.3mb 
yellow open   branch_01_devices-2017.12.02       5   1       8693            0      2.1mb          2.1mb 
yellow open   branch_01_devices-2017.12.01       5   1       7684            0      1.9mb          1.9mb 
yellow open   branch_01_devices-2017.12.04       5   1       6778            0      1.3mb          1.3mb 
yellow open   branch_01_devices-2017.12.03       5   1       7709            0      1.9mb          1.9mb 
yellow open   branch_01_devices-2017.12.09       5   1      15243            0      3.8mb          3.8mb 
yellow open   branch_01_devices-2017.12.06       5   1       7031            0      1.6mb          1.6mb 
yellow open   branch_01_devices-2017.12.05       5   1       7286            0      1.7mb          1.7mb 
yellow open   branch_01_devices-2017.12.08       5   1      14330            0      3.5mb          3.5mb 
yellow open   branch_01_devices-2017.12.07       5   1       8080            0        2mb            2mb 
yellow open   branch_01_devices-2018.11          5   1     151275            0     33.6mb         33.6mb 
yellow open   branch_01_devices-2018.10          5   1     269821            0     56.1mb         56.1mb 
yellow open   branch_01_devices-2010.02.28       5   1       5512            0      1.2mb          1.2mb 
yellow open   branch_01_devices-2010.02.27       5   1       5528            0      1.3mb          1.3mb 
yellow open   branch_01_devices-2010.02.22       5   1       5478            0      1.1mb          1.1mb 
yellow open   branch_01_devices-2010.02.21       5   1       5682            0      1.3mb          1.3mb 
yellow open   branch_01_devices-2010.02.20       5   1       5740            0      1.4mb          1.4mb 
yellow open   branch_01_devices-2010.02.26       5   1       5471            0        1mb            1mb 
yellow open   branch_01_devices-2010.02.25       5   1       5438            0      1.1mb          1.1mb 
yellow open   branch_01_devices-2010.02.24       5   1       5564            0      1.3mb          1.3mb 
yellow open   branch_01_devices-2010.02.23       5   1       5642            0      1.2mb          1.2mb 
yellow open   branch_01_devices-2010.02.19       5   1       5873            0      1.2mb          1.2mb 
yellow open   branch_01_devices-2010.02.18       5   1       5525            0      1.3mb          1.3mb
  2. AS YOU CAN SEE ABOVE, THERE ARE DAILY INDICES AND MONTHLY INDICES. WHICH IS PREFERABLE?

Our setup is 7 branches with about ~50 devices (tablets) each.
A server in each branch collects all the logs and sends them via Filebeat to the main server.
There, Logstash processes them and indexes everything into Elasticsearch; Kibana runs on the same machine.
Does it all make sense?
Do you see any weaknesses in this structure?

Thank you very much.


(Christian Dahlqvist) #6

Even the monthly indices are very small, so I would recommend switching to monthly indices across the board and reducing the number of primary shards for these to 1.
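Reducing the primary shard count for future indices can be done with an index template; a sketch (the template name and pattern are assumptions, and on Elasticsearch 5.x the key is `template` rather than `index_patterns`):

```shell
# Any new index matching the pattern gets 1 primary shard and no
# replicas, which is appropriate for a single-node cluster.
curl -XPUT 'localhost:9200/_template/branch_devices' -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["branch_01_devices-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}'
```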

Then reindex the data from the older daily indices into those monthly indices, and delete the daily indices once that is done. That should reduce the shard count significantly and leave you in a better place.
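The reindex-then-delete step could look roughly like this (a sketch using index names from the thread as examples; verify the document counts match before deleting anything):

```shell
# Fold one month of daily indices into a single monthly index.
# The wildcard matches only the daily indices (they have an extra
# ".DD" suffix), not the monthly destination. Repeat per month.
curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d '
{
  "source": { "index": "branch_01_devices-2017.12.*" },
  "dest":   { "index": "branch_01_devices-2017.12" }
}'

# Once the reindex has finished and the counts check out,
# delete the daily indices:
curl -XDELETE 'localhost:9200/branch_01_devices-2017.12.*'
```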


(Pastrufazio) #7

Thank you Christian,

since you have already been so kind, may I ask a few more questions?

I would recommend switching to monthly indices across the board

Correct me if I'm wrong: according to my Logstash config file (below), the monthly indices are handled by Logstash itself, and everything already seems set up to produce monthly indices. We don't need to touch this file, right?

# FILE /etc/logstash/conf.d/logstash.conf

input {
  beats {
    port => 5044
    codec => json
  }
}


filter {

  date {
    match => ["Date", "YYYY-MM-dd HH:mm:ss"]
    target => "@timestamp"
  }

  date {
    match => ["visitStartTime", "HH:mm:ss"]
  }

  if [Application][Action] == "START" {
    mutate {
      add_tag => ["taskStarted"]
      add_field => { "vId" => "%{[Application][visit_id]}" }
    }
  } else if [Application][Action] == "STOP" {
    mutate {
      add_tag => ["taskTerminated"]
      add_field => { "vId" => "%{[Application][visit_id]}" }
    }
  }

  elapsed {
    start_tag => "taskStarted"
    end_tag => "taskTerminated"
    unique_id_field => "vId"
    timeout => 18000
    new_event_on_match => false
  }
 
}

output {
  elasticsearch{
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[Location]}-%{+YYYY.MM}"
  }

}

...and reduce the number of primary shards for these to 1.

I don't know how to do that. Is this a good starting point?
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-shrink-index.html
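If shrink is the route taken, the flow would be roughly as follows (a sketch with example index names; note that shrink only reduces the shard count of an existing index, it does not merge daily indices into monthly ones):

```shell
# 1. Make the source index read-only (required by the shrink API)
#    and drop its replicas so a full copy sits on this node:
curl -XPUT 'localhost:9200/branch_01_devices-2018.08/_settings' -H 'Content-Type: application/json' -d '
{
  "index.number_of_replicas": 0,
  "index.blocks.write": true
}'

# 2. Shrink the 5-shard index into a new 1-shard index
#    (5 is divisible by 1, so the target shard count is valid):
curl -XPOST 'localhost:9200/branch_01_devices-2018.08/_shrink/branch_01_devices-2018.08-shrunk' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}'
```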

Then reindex the data in the older daily indices into such monthly indices and delete the daily indices once this is done.

Sorry, I'm only a two-day-old Filebeat/Logstash/Elasticsearch/Kibana user. Any help would be really appreciated!

That should reduce the shard count significantly and leave you in a better place.

That sounds like a smooth, winning plan.
Thank you very much, Christian.