All of my shards suddenly unassigned


(aviv ratzon) #1

I am running a single node of Elasticsearch 2.2, and after restarting the node all of the shards suddenly became unassigned. These are the cluster health stats:

{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 4,
  "unassigned_shards" : 108,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 0.0
}

I've read about this problem happening to SOME of the shards on a multi-node cluster, but I'm running a much simpler instance and this seems really weird.


(David Pilato) #2

What do you have in logs?


(aviv ratzon) #3

it keeps filling up with these:

Caused by: java.nio.file.NoSuchFileException: C:\Users\tc99138\Desktop\elasticsearch\data\elasticsearch\nodes\0\indices\logstash-de\0\index\_u_Lucene50_0.tip
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:79)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
at sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:115)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:335)
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:89)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:89)
at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:173)
at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat.fieldsProducer(Lucene50PostingsFormat.java:446)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:261)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:341)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:104)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:197)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:99)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:435)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:86)
at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:296)
... 12 more


(aviv ratzon) #4

I tried deleting the index logstash-de, and now this is the health status:

{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 50,
  "active_shards" : 50,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 52,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 49.01960784313725
}

Only the replica shards are unassigned.


(Mark Walkom) #5

Replica shards will never be assigned when you have a single node; putting a primary and its replica on the same node would defeat the purpose of having replicas.
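
To illustrate the rule above, here is a minimal sketch in Python. It is a toy model, not Elasticsearch's actual allocator: the function name `allocate`, the tuple layout, and the node names are all hypothetical. The only rule it encodes is that two copies of the same shard never share a node.

```python
def allocate(shards, nodes):
    """Toy allocator: place each shard copy on the first node that does not
    already hold a copy of the same shard. Returns {shard_copy: node or None}.

    shards: list of (index_name, shard_number, "p" or "r") tuples (hypothetical layout)
    nodes:  list of node names
    """
    holders = {}  # (index_name, shard_number) -> set of nodes holding a copy
    placement = {}
    for index_name, number, kind in shards:
        used = holders.setdefault((index_name, number), set())
        free = [node for node in nodes if node not in used]
        node = free[0] if free else None  # None means the copy stays unassigned
        if node is not None:
            used.add(node)
        placement[(index_name, number, kind)] = node
    return placement


if __name__ == "__main__":
    copies = [("logs", 0, "p"), ("logs", 0, "r")]
    print(allocate(copies, ["node-1"]))            # replica has nowhere to go
    print(allocate(copies, ["node-1", "node-2"]))  # replica lands on the second node
```

With a single node the replica has no eligible node and stays unassigned, which matches what the health output shows.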


(aviv ratzon) #6

So why is the status still red?


(Mark Walkom) #7

Take a look at _cat/shards; you probably still have an unassigned primary.
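
As a rough sketch of that check, the Python below scans saved `_cat/shards` text for unassigned primaries and derives a simplified health color. It assumes the default column order (index, shard, prirep, state, ...); the helper names and the `logstash-example` rows are made up for illustration, and the color rule is a simplification of what the cluster health API reports.

```python
ACTIVE_STATES = {"STARTED", "RELOCATING"}


def parse_shards(cat_shards_text):
    """Yield (index, shard_number, prirep, state) from _cat/shards output."""
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = fields[:4]
        yield index, int(shard), prirep, state


def unassigned_primaries(cat_shards_text):
    """Return [(index, shard_number)] for every unassigned primary."""
    return [
        (index, shard)
        for index, shard, prirep, state in parse_shards(cat_shards_text)
        if prirep == "p" and state == "UNASSIGNED"
    ]


def health_color(cat_shards_text):
    """Simplified rule: red if any primary is inactive, yellow if only
    replicas are inactive, green otherwise."""
    color = "green"
    for _, _, prirep, state in parse_shards(cat_shards_text):
        if state in ACTIVE_STATES:
            continue
        if prirep == "p":
            return "red"
        color = "yellow"
    return color


# Hypothetical sample; only the .kibana rows come from this thread.
SAMPLE = """\
.kibana 0 p UNASSIGNED
.kibana 0 r UNASSIGNED
logstash-example 0 p STARTED 10 5kb 127.0.0.1 Visimajoris
logstash-example 0 r UNASSIGNED
"""

if __name__ == "__main__":
    print(unassigned_primaries(SAMPLE))
    print(health_color(SAMPLE))
```

Under this rule a single unassigned primary is enough to keep the whole cluster red, no matter how many other primaries are active.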


(aviv ratzon) #8

Only the kibana index (I don't know if it's supposed to be like that):

.kibana 0 p UNASSIGNED
.kibana 0 r UNASSIGNED

This is the log:

[2016-03-22 11:06:26,137][INFO ][node ] [Visimajoris] initialized
[2016-03-22 11:06:26,137][INFO ][node ] [Visimajoris] starting ...
[2016-03-22 11:06:26,363][INFO ][transport ] [Visimajoris] publish_address {127.0.0.1:9300}, bound_addresses {[::]:9300}
[2016-03-22 11:06:26,370][INFO ][discovery ] [Visimajoris] elasticsearch/HaBNtgVlSM-EbpHFbb8uhg
[2016-03-22 11:06:30,436][INFO ][cluster.service ] [Visimajoris] new_master {Visimajoris}{HaBNtgVlSM-EbpHFbb8uhg}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-03-22 11:06:30,567][INFO ][http ] [Visimajoris] publish_address {127.0.0.1:9200}, bound_addresses {[::]:9200}
[2016-03-22 11:06:30,568][INFO ][node ] [Visimajoris] started
[2016-03-22 11:06:30,630][INFO ][gateway ] [Visimajoris] recovered [11] indices into cluster_state


#9

I think you should update your index settings and set the number of replicas to 0:

PUT /_settings
    {
        "index" : {
            "number_of_replicas" : 0
        }
    }

This should solve the problem.
Also make sure that you did not disable automatic shard allocation in your elasticsearch.yml through the cluster.routing.allocation.enable setting.
More here: https://www.elastic.co/guide/en/elasticsearch/reference/current/shards-allocation.html
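
For anyone scripting this, the same settings update could be sketched in Python with only the standard library. The helper name `replicas_request` is hypothetical, and the base URL assumes the default HTTP address shown in the log above (127.0.0.1:9200); adjust it for your node. The request is only built here, not sent.

```python
import json
import urllib.request


def replicas_request(base_url, replicas=0):
    """Build (but do not send) a PUT /_settings request that sets
    number_of_replicas for all indices."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = replicas_request("http://127.0.0.1:9200")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply the change against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before touching a live cluster.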


(aviv ratzon) #10

Thanks everyone for your answers! After disabling the replicas and deleting the kibana index, it worked.

Much appreciated!

