TL;DR: will data nodes be offline during reindexing after a major upgrade (1.x -> 2.x)?
Long story:
Our DEV cluster has 4 nodes - 2 pure data nodes, 1 master and 1 client-only node.
I understand that after upgrading, the indexes will be rebuilt (new version of Lucene, etc.), and also that I had to set up unicast, since multicast is no longer used for discovery.
The value of the unicast.hosts property is the IP address of the single master node, and the discovery settings are the same on all four boxes.
After upgrading all the nodes, I see that the two data nodes frequently max out their CPU and seem to be busy rebuilding the index (which I understood to be normal). The issue is that when I hit the client node, the only nodes that appear in the cluster are my master and client nodes; the data nodes aren't showing as part of the cluster (looking at /_nodes?pretty).
Is this normal? Will the data nodes be offline until reindexing is complete?
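One way to make the /_nodes check repeatable is a small script that compares the nodes that have joined against the ones you expect. This is only a rough sketch: the node names, the sample response, and the helper names are made up for illustration, and the only thing it relies on is that the /_nodes response has a top-level "nodes" object whose entries each carry a "name".

```python
import json
from urllib.request import urlopen


def fetch_nodes(base_url):
    """GET <base_url>/_nodes and return the parsed JSON body."""
    with urlopen(base_url.rstrip("/") + "/_nodes") as resp:
        return json.loads(resp.read().decode("utf-8"))


def missing_nodes(nodes_response, expected_names):
    """Return the expected node names that are absent from a /_nodes response."""
    joined = {info.get("name") for info in nodes_response.get("nodes", {}).values()}
    return sorted(set(expected_names) - joined)


# Hypothetical response in which only the master and client nodes have joined.
sample = {
    "cluster_name": "dev",
    "nodes": {
        "id-1": {"name": "master-1"},
        "id-2": {"name": "client-1"},
    },
}
# Hypothetical node names; replace with the names used in your cluster.
expected = ["master-1", "client-1", "data-1", "data-2"]

print(missing_nodes(sample, expected))  # ['data-1', 'data-2']
```

Against a live cluster you would pass the result of fetch_nodes("http://<client-node>:9200") instead of the sample dictionary.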
It doesn't reindex. It may upgrade the underlying segments to the latest Lucene version, but only if they merge. I don't know how long this takes, though, TBH.
I let it run overnight. The high CPU activity appears to have settled, but the logs are littered with these messages:
ElasticsearchException[failed to flush exporter bulks]
    at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:104)
    at org.elasticsearch.marvel.agent.exporter.ExportBulk.close(ExportBulk.java:53)
    at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:201)
    at java.lang.Thread.run(Thread.java:745)
    Suppressed: ElasticsearchException[failed to flush [default_local] exporter bulk]; nested: ElasticsearchException[failure in bulk execution: [0]: index [.marvel-es-2016.02.22], type [node_stats], id [AVMLbBxGhr_mPmdg1NfF], message [UnavailableShardsException[[.marvel-es-2016.02.22][0] primary shard is not active Timeout: [1m], request: [shard bulk {[.marvel-es-2016.02.22][0]}]]]];
        at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:106)
        ... 3 more
    Caused by: ElasticsearchException[failure in bulk execution: [0]: index [.marvel-es-2016.02.22], type [node_stats], id [AVMLbBxGhr_mPmdg1NfF], message [UnavailableShardsException[[.marvel-es-2016.02.22][0] primary shard is not active Timeout: [1m], request: [shard bulk {[.marvel-es-2016.02.22][0]}]]]]
        at org.elasticsearch.marvel.agent.exporter.local.LocalBulk.flush(LocalBulk.java:114)
        at org.elasticsearch.marvel.agent.exporter.ExportBulk$Compound.flush(ExportBulk.java:101)
        ... 3 more
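The "primary shard is not active" part would be consistent with the data nodes not having joined the cluster, since there would then be nowhere to allocate the .marvel-es primaries, though that is a guess from the trace alone. To see which indices are affected, the output of /_cluster/health?level=indices can be scanned for indices with fewer active primaries than shards. A minimal sketch, with a made-up sample response and a hypothetical helper name:

```python
def indices_with_inactive_primaries(health):
    """Return names of indices whose active primaries are fewer than their shard count.

    `health` is the parsed JSON from /_cluster/health?level=indices.
    """
    inactive = []
    for name, info in health.get("indices", {}).items():
        if info.get("active_primary_shards", 0) < info.get("number_of_shards", 0):
            inactive.append(name)
    return sorted(inactive)


# Hypothetical health response; the values are illustrative, not taken from the cluster above.
sample_health = {
    "status": "red",
    "indices": {
        ".marvel-es-2016.02.22": {
            "status": "red",
            "number_of_shards": 1,
            "active_primary_shards": 0,
        },
        "healthy-index": {
            "status": "green",
            "number_of_shards": 5,
            "active_primary_shards": 5,
        },
    },
}

print(indices_with_inactive_primaries(sample_health))  # ['.marvel-es-2016.02.22']
```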