ES 0.20.4 cluster can't start

arta · February 8, 2013, 3:19am

Today I upgraded from 0.19.3 to 0.20.3.
I also upgraded from jdk 1.6.0_14 to 1.7.0_13.
I did a full cluster shutdown, replaced ES, then started the cluster.

Once this happens, the number_of_nodes in /_cluster/health increases.
I have 19 nodes; the count drops to 18 while a node is restarting, then goes to 20 (not 19).

elasticsearch/bin/service/elasticsearch.conf has the following JVM config:
wrapper.java.additional.1=-Delasticsearch-service
wrapper.java.additional.2=-Des.path.home=%ES_HOME%
wrapper.java.additional.3=-Xss256k
wrapper.java.additional.4=-XX:+UseG1GC
wrapper.java.additional.5=-XX:+HeapDumpOnOutOfMemoryError
wrapper.java.additional.6=-Djava.awt.headless=true
wrapper.java.additional.7=-XX:PermSize=256m
wrapper.java.additional.8=-XX:MaxPermSize=256m
wrapper.java.additional.9=-Dcom.sun.management.jmxremote
wrapper.java.additional.10=-Dcom.sun.management.jmxremote.port=3333
wrapper.java.additional.11=-Dcom.sun.management.jmxremote.authenticate=false
wrapper.java.additional.12=-Dcom.sun.management.jmxremote.ssl=false

Two questions:

1. What are the possible reasons the wrapper says "JVM received a signal UNKNOWN (6)"?
2. What does a higher number_of_nodes in /_cluster/health indicate?

Thanks in advance.
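
For the second question, a small check against /_cluster/health can at least make the mismatch explicit while nodes restart. This is a minimal sketch, not a diagnosis: the base URL `http://localhost:9200`, the helper names, and the sample response are assumptions for illustration. Only `number_of_nodes` comes from the post above.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and return the parsed JSON body.

    base_url is an assumption; point it at any node in the cluster.
    """
    with urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def node_count_report(health, expected_nodes):
    """Describe how number_of_nodes compares with the expected count."""
    actual = health["number_of_nodes"]
    if actual == expected_nodes:
        return "ok: %d nodes" % actual
    if actual < expected_nodes:
        return "missing: %d of %d nodes joined" % (actual, expected_nodes)
    return "extra: %d nodes reported, expected %d" % (actual, expected_nodes)


# Hand-written sample response (hypothetical values), so no live cluster is needed.
sample = {"cluster_name": "example", "status": "yellow", "number_of_nodes": 20}
print(node_count_report(sample, 19))
```

Against a live cluster, `node_count_report(fetch_health(), 19)` would do the same comparison. The health response also carries `number_of_data_nodes`, so comparing the two values may show whether the extra member is a non-data node.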

arta · February 8, 2013, 3:22am

Sorry, one correction. I upgraded to 0.20.4.

vineeth_mohan · February 8, 2013, 5:08am

Can you provide the debug log?

Thanks
Vineeth

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/ES-0-20-4-cluster-can-t-start-tp4029492p4029493.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

arta · February 8, 2013, 5:10am

I found what look like crash dump files, so I gist'ed them:

https://gist.github.com/artaa/4736741

hs_err_pid12759.log

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00002aaaab6b8f07, pid=12759, tid=1157581120
#
# JRE version: 7.0_13-b20
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J  org.elasticsearch.common.trove.impl.hash.TObjectHash.insertKey(Ljava/lang/Object;)I
#

https://gist.github.com/artaa/4736759

hs_err_pid19177.log

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00002aaaab6ec094, pid=19177, tid=1074166080
#
# JRE version: 7.0_13-b20
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J  org.elasticsearch.common.trove.map.hash.TObjectIntHashMap.put(Ljava/lang/Object;I)I
#

Hope it helps.

arta · February 8, 2013, 5:36am

The ES log from around the restart time (which matches the time of the first crash dump) is gist'ed here:

https://gist.github.com/artaa/4736880

gistfile1.txt

[2013-02-07 18:55:51,931][DEBUG][gateway.local            ] [Valkyrie] [i14][28]: forcing allocating [[i14][28], node[null], [P], s[UNASSIGNED]] to [[Firepower][Mf9jtX8xQ-aGggLuOGqsiQ][inet[/10.5.124.110:9300]]] on primary allocation
[2013-02-07 18:55:51,934][DEBUG][gateway.local            ] [Valkyrie] [i14][31]: forcing allocating [[i14][31], node[null], [P], s[UNASSIGNED]] to [[Firepower][Mf9jtX8xQ-aGggLuOGqsiQ][inet[/10.5.124.110:9300]]] on primary allocation
[2013-02-07 18:55:51,937][DEBUG][gateway.local            ] [Valkyrie] [i15][0]: forcing allocating [[i15][0], node[null], [P], s[UNASSIGNED]] to [[Firepower][Mf9jtX8xQ-aGggLuOGqsiQ][inet[/10.5.124.110:9300]]] on primary allocation
[2013-02-07 18:55:51,940][DEBUG][gateway.local            ] [Valkyrie] [i15][2]: forcing allocating [[i15][2], node[null], [P], s[UNASSIGNED]] to [[Grey, Nate][S-KBjvWvQneQesLO8zGRCw][inet[/10.5.120.45:9300]]] on primary allocation
[2013-02-07 18:55:51,943][DEBUG][gateway.local            ] [Valkyrie] [i15][3]: forcing allocating [[i15][3], node[null], [P], s[UNASSIGNED]] to [[Crazy Eight][D2cG8645RwOeZF0B1pijvg][inet[/10.5.120.63:9300]]] on primary allocation
[2013-02-07 18:55:51,946][DEBUG][gateway.local            ] [Valkyrie] [i15][6]: forcing allocating [[i15][6], node[null], [P], s[UNASSIGNED]] to [[Firepower][Mf9jtX8xQ-aGggLuOGqsiQ][inet[/10.5.124.110:9300]]] on primary allocation
[2013-02-07 18:55:51,949][DEBUG][gateway.local            ] [Valkyrie] [i15][8]: forcing allocating [[i15][8], node[null], [P], s[UNASSIGNED]] to [[Firepower][Mf9jtX8xQ-aGggLuOGqsiQ][inet[/10.5.124.110:9300]]] on primary allocation
[2013-02-07 18:55:58,636][INFO ][node                     ] [Kala] {0.20.4}[14113]: initializing ...
[2013-02-07 18:55:58,644][INFO ][plugins                  ] [Kala] loaded [], sites []
[2013-02-07 18:56:00,904][DEBUG][gateway.local            ] [Kala] using initial_shards [quorum], list_timeout [30s]

kimchy · February 12, 2013, 10:38pm

Try removing the G1 garbage collector; based on the reports we get, it's not stable yet. Prefer to stick with the default settings that come out of the box.

Also, if you can get away from using JMX and use the ES stats APIs instead, it would be even better.
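
For anyone applying this to a wrapper config like the one earlier in the thread, here is a minimal sketch of dropping a flag from the `wrapper.java.additional.N` entries. It renumbers the remaining entries on the assumption that the Java Service Wrapper expects the sequence to be contiguous unless told to ignore gaps. The function name and the shortened config are made up for illustration.

```python
import re

# Matches lines such as "wrapper.java.additional.4=-XX:+UseG1GC".
ADDITIONAL = re.compile(r"^wrapper\.java\.additional\.(\d+)=(.*)$")


def drop_jvm_flags(conf_text, unwanted_prefixes):
    """Remove JVM flags starting with any given prefix and renumber the rest from 1."""
    out = []
    next_index = 1
    for line in conf_text.splitlines():
        match = ADDITIONAL.match(line.strip())
        if match is None:
            out.append(line)  # unrelated setting, keep unchanged
            continue
        flag = match.group(2)
        if flag.startswith(tuple(unwanted_prefixes)):
            continue  # drop the unwanted flag
        out.append("wrapper.java.additional.%d=%s" % (next_index, flag))
        next_index += 1
    return "\n".join(out)


# Shortened copy of the config from the first post.
conf = """wrapper.java.additional.1=-Delasticsearch-service
wrapper.java.additional.2=-Des.path.home=%ES_HOME%
wrapper.java.additional.3=-Xss256k
wrapper.java.additional.4=-XX:+UseG1GC
wrapper.java.additional.5=-XX:+HeapDumpOnOutOfMemoryError"""

print(drop_jvm_flags(conf, ["-XX:+UseG1GC"]))
```

Passing `"-Dcom.sun.management.jmxremote"` as another prefix would drop the four JMX entries in the same pass. This only removes flags; it does not restore whatever GC settings the stock elasticsearch.conf ships with, so comparing against the original file from the distribution is still worthwhile.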

arta · February 12, 2013, 11:16pm

Thank you Kimchy.
I will definitely try that.
