ElasticSearch 1.7.3 crashed and doesn't restart - no logs

thomas.dotreppe · December 1, 2015, 5:21pm

System info:

Ubuntu 14.04 64 bit LTS
ElasticSearch 1.7.3 from Elastic repo

I've been using that install for quite a while with no problems. It suddenly crashed for an unknown reason. Unfortunately, there is no log of the crash, and starting it with 'service elasticsearch start' doesn't work: it hangs for about 10 seconds, then fails, and no logs are generated in /var/log/elasticsearch (there are logs, but they are from before the crash).

Here are the last few lines of the log file, in case they are useful (indexing_slowlog and search_slowlog are both empty):

[2015-11-30 10:23:17,551][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.01.30] update_mapping [monit] (dynamic)
[2015-11-30 10:57:01,403][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [snort] (dynamic)
[2015-11-30 11:43:31,081][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [sophos] (dynamic)
[2015-11-30 12:26:45,955][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [fortinet] (dynamic)
[2015-11-30 12:28:10,743][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [fortinet] (dynamic)
[2015-11-30 13:36:47,798][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [auth] (dynamic)
[2015-11-30 13:37:04,885][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [auth] (dynamic)
[2015-11-30 14:34:35,672][INFO ][cluster.metadata         ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [iptables-dropped] (dynamic)
[2015-11-30 14:51:38,409][WARN ][monitor.jvm              ] [Tomorrow Man] [gc][young][2427240][80285] duration [1s], collections [1]/[1.7s], total [1s]/[1.7h], memory [10.6gb]->[10.2gb]/[18.9gb], all_pools {[young] [396.7mb]->[1mb]/[399.4mb]}{[survivor] [24.4mb]->[23.2mb]/[49.8mb]}{[old] [10.2gb]->[10.2gb]/[18.5gb]}
[2015-11-30 14:57:44,552][WARN ][monitor.jvm              ] [Tomorrow Man] [gc][young][2427567][80301] duration [4.1s], collections [1]/[5.1s], total [4.1s]/[1.7h], memory [10.6gb]->[10.2gb]/[18.9gb], all_pools {[young] [398.5mb]->[9.5mb]/[399.4mb]}{[survivor] [22.2mb]->[22.9mb]/[49.8mb]}{[old] [10.2gb]->[10.2gb]/[18.5gb]}

I checked that /var/run/elasticsearch has the right permissions, and it does (owned by the 'elasticsearch' user):

~# ls /var/run/ -al | grep elastic
drwxr-xr-x  2 elasticsearch elasticsearch   60 Nov  2 12:23 elasticsearch

I found http://sandlininc.com/?p=747 and added the log line

log_daemon_msg "sudo -u $ES_USER $DAEMON $DAEMON_OPTS"

right before the startup. Starting it with the sudo command displayed on the screen works fine (and displays no errors) but starting it through 'service' still fails:

sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch

Unfortunately, updating to 2.1.0 is not an option at this time since another tool we use depends on 1.7.3 to work.

As mentioned by someone else, there is more than enough disk space (more than 250GB left). RAM isn't an issue either: the system has 24GB, of which 19GB is given to ElasticSearch (ES_HEAP_SIZE=19g in /etc/default/elasticsearch).

Is there anything else I can do or look for to debug this issue?

magnusbaeck · December 1, 2015, 5:51pm

I'd run the init script with -x so that it logs all commands being run:

sudo bash -x /etc/init.d/elasticsearch start

thomas.dotreppe · December 1, 2015, 6:44pm

I might be wrong but it doesn't look like it contains anything useful.

Here is the complete output (part 1/2):

+ PATH=/bin:/usr/bin:/sbin:/usr/sbin
+ NAME=elasticsearch
+ DESC='Elasticsearch Server'
+ DEFAULT=/etc/default/elasticsearch
++ id -u
+ '[' 0 -ne 0 ']'
+ . /lib/lsb/init-functions
+++ run-parts --lsbsysinit --list /lib/lsb/init-functions.d
++ for hook in '$(run-parts --lsbsysinit --list /lib/lsb/init-functions.d 2>/dev/null)'
++ '[' -r /lib/lsb/init-functions.d/20-left-info-blocks ']'
++ . /lib/lsb/init-functions.d/20-left-info-blocks
++ for hook in '$(run-parts --lsbsysinit --list /lib/lsb/init-functions.d 2>/dev/null)'
++ '[' -r /lib/lsb/init-functions.d/50-ubuntu-logging ']'
++ . /lib/lsb/init-functions.d/50-ubuntu-logging
+++ LOG_DAEMON_MSG=
++ FANCYTTY=
++ '[' -e /etc/lsb-base-logging.sh ']'
++ true
+ '[' -r /etc/default/rcS ']'
+ . /etc/default/rcS
++ UTC=yes
+ ES_USER=elasticsearch
+ ES_GROUP=elasticsearch
+ JDK_DIRS='/usr/lib/jvm/java-8-oracle/ /usr/lib/jvm/j2sdk1.8-oracle/ /usr/lib/jvm/jdk-7-oracle-x64 /usr/lib/jvm/java-7-oracle /usr/lib/jvm/j2sdk1.7-oracle/ /usr/lib/jvm/java-7-openjdk /usr/lib/jvm/java-7-openjdk-amd64/ /usr/lib/jvm/java-7-openjdk-armhf /usr/lib/jvm/java-7-openjdk-i386/ /usr/lib/jvm/default-java'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-8-oracle//bin/java -a -z '' ']'
+ JAVA_HOME=/usr/lib/jvm/java-8-oracle/
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/j2sdk1.8-oracle//bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/jdk-7-oracle-x64/bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-7-oracle/bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/j2sdk1.7-oracle//bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-7-openjdk/bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-7-openjdk-amd64//bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-7-openjdk-armhf/bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/java-7-openjdk-i386//bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ for jdir in '$JDK_DIRS'
+ '[' -r /usr/lib/jvm/default-java/bin/java -a -z /usr/lib/jvm/java-8-oracle/ ']'
+ export JAVA_HOME
+ ES_HOME=/usr/share/elasticsearch
+ MAX_OPEN_FILES=65535
+ LOG_DIR=/var/log/elasticsearch
+ DATA_DIR=/var/lib/elasticsearch
+ WORK_DIR=/tmp/elasticsearch
+ CONF_DIR=/etc/elasticsearch
+ CONF_FILE=/etc/elasticsearch/elasticsearch.yml
+ MAX_MAP_COUNT=262144
+ PID_DIR=/var/run/elasticsearch
+ '[' -f /etc/default/elasticsearch ']'
+ . /etc/default/elasticsearch
++ ES_HEAP_SIZE=19g
+ PID_FILE=/var/run/elasticsearch/elasticsearch.pid
+ DAEMON=/usr/share/elasticsearch/bin/elasticsearch
+ DAEMON_OPTS='-d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch'
+ export ES_HEAP_SIZE
+ export ES_HEAP_NEWSIZE
+ export ES_DIRECT_SIZE
+ export ES_JAVA_OPTS
+ test -x /usr/share/elasticsearch/bin/elasticsearch
+ case "$1" in
+ checkJava
+ '[' -x /usr/lib/jvm/java-8-oracle//bin/java ']'
+ JAVA=/usr/lib/jvm/java-8-oracle//bin/java
+ '[' '!' -x /usr/lib/jvm/java-8-oracle//bin/java ']'
+ '[' -n '' -a -z 19g ']'
+ log_daemon_msg 'Starting Elasticsearch Server'
+ '[' -z 'Starting Elasticsearch Server' ']'
+ log_use_fancy_output
+ TPUT=/usr/bin/tput
+ EXPR=/usr/bin/expr
+ '[' -t 1 ']'
+ '[' xxterm '!=' x ']'
+ '[' xxterm '!=' xdumb ']'
+ '[' -x /usr/bin/tput ']'
+ '[' -x /usr/bin/expr ']'
+ /usr/bin/tput hpa 60
+ /usr/bin/tput setaf 1
+ '[' -z ']'
+ FANCYTTY=1
+ case "$FANCYTTY" in
+ true

thomas.dotreppe · December 1, 2015, 6:48pm

+ /usr/bin/tput xenl
++ /usr/bin/tput cols
+ COLS=144
+ '[' 144 ']'
+ '[' 144 -gt 6 ']'
++ /usr/bin/expr 144 - 7
+ COL=137
+ log_use_plymouth
+ '[' n = y ']'
+ plymouth --ping
+ printf ' * Starting Elasticsearch Server       '
 * Starting Elasticsearch Server       ++ /usr/bin/expr 144 - 1
+ /usr/bin/tput hpa 143
                                                                                                                                               + printf ' '
 ++ pidofproc -p /var/run/elasticsearch/elasticsearch.pid elasticsearch
++ local pidfile base status specified pid OPTIND
++ pidfile=
++ specified=
++ OPTIND=1
++ getopts p: opt
++ case "$opt" in
++ pidfile=/var/run/elasticsearch/elasticsearch.pid
++ specified=specified
++ getopts p: opt
++ shift 2
++ '[' 1 -ne 1 ']'
++ base=elasticsearch
++ '[' '!' specified ']'
++ '[' -n /var/run/elasticsearch/elasticsearch.pid -a -r /var/run/elasticsearch/elasticsearch.pid ']'
++ read pid
++ '[' -n '' ']'
++ '[' -n specified ']'
++ '[' -e /var/run/elasticsearch/elasticsearch.pid -a '!' -r /var/run/elasticsearch/elasticsearch.pid ']'
++ return 3
+ pid=
+ '[' -n '' ']'
+ mkdir -p /var/log/elasticsearch /var/lib/elasticsearch /tmp/elasticsearch
+ chown elasticsearch:elasticsearch /var/log/elasticsearch /var/lib/elasticsearch /tmp/elasticsearch
+ '[' -n /var/run/elasticsearch ']'
+ '[' '!' -e /var/run/elasticsearch ']'
+ '[' -n /var/run/elasticsearch/elasticsearch.pid ']'
+ '[' '!' -e /var/run/elasticsearch/elasticsearch.pid ']'
+ '[' -n 65535 ']'
+ ulimit -n 65535
+ '[' -n '' ']'
+ '[' -n 262144 -a -f /proc/sys/vm/max_map_count ']'
+ sysctl -q -w vm.max_map_count=262144
+ log_daemon_msg 'sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch'
+ '[' -z 'sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch' ']'
+ log_use_fancy_output
+ TPUT=/usr/bin/tput
+ EXPR=/usr/bin/expr
+ '[' -t 1 ']'
+ '[' xxterm '!=' x ']'
+ '[' xxterm '!=' xdumb ']'
+ '[' -x /usr/bin/tput ']'
+ '[' -x /usr/bin/expr ']'
+ /usr/bin/tput hpa 60
+ /usr/bin/tput setaf 1
+ '[' -z 1 ']'
+ true
+ case "$FANCYTTY" in
+ true
+ /usr/bin/tput xenl
++ /usr/bin/tput cols
+ COLS=144
+ '[' 144 ']'
+ '[' 144 -gt 6 ']'
++ /usr/bin/expr 144 - 7
+ COL=137
+ log_use_plymouth
+ '[' n = y ']'
+ plymouth --ping
+ printf ' * sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch       '
 * sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch       ++ /usr/bin/expr 144 - 1
+ /usr/bin/tput hpa 143
                                                                                                                                               + printf ' '
 + start-stop-daemon --start -b --user elasticsearch -c elasticsearch --pidfile /var/run/elasticsearch/elasticsearch.pid --exec /usr/share/elasticsearch/bin/elasticsearch -- -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch
+ return=0
+ '[' 0 -eq 0 ']'
+ i=0
+ timeout=10
+ exit 0

magnusbaeck · December 1, 2015, 7:05pm

Hmm, okay. It looks like it's actually trying to start ES (i.e. the init script isn't dying earlier on). I'd try sudo strace -f bash /etc/init.d/elasticsearch start (pipe stdout and stderr to a file!) to get further clues, but beware that reading the strace output isn't necessarily easy.
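A minimal sketch of the "pipe stdout and stderr to a file" part, assuming a POSIX shell. The helper name `capture_all` and the output path in the usage line are hypothetical and not from this thread.

```shell
#!/bin/sh
# capture_all OUTFILE CMD [ARGS...]
# Runs CMD with both stdout and stderr redirected into OUTFILE and
# returns CMD's exit status. (Hypothetical helper, for illustration only.)
capture_all() {
    outfile=$1
    shift
    "$@" >"$outfile" 2>&1
}
```

It could be used as `capture_all /tmp/es-strace.log sudo strace -f bash /etc/init.d/elasticsearch start`. strace also has an `-o FILE` option that writes only the trace to a file, which may be cleaner if you want to keep the script's own output separate.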

thomas.dotreppe · December 1, 2015, 7:41pm

Where would you like me to send it to? The file is fairly large (~500K uncompressed, ~250K compressed).

magnusbaeck · December 1, 2015, 8:30pm

I was hoping you'd be able to analyze it yourself. I don't have time to do it. Maybe someone else here can chip in. I'd share the file on Google Drive, Dropbox, or a similar service.

thomas.dotreppe · December 1, 2015, 8:54pm

I'll open a bug report.

warkolm · December 1, 2015, 9:21pm

Run this yourself; it might tell you something more:

sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch

thomas.dotreppe · December 1, 2015, 10:11pm

warkolm, I mentioned that I did it in my first post and that works fine (and doesn't return logs). Running it via 'service' (or /etc/init.d/elasticsearch) fails.

Clinton_Gormley · December 2, 2015, 11:37am

... in which case I'd try the same thing that @warkolm suggested but remove the -d option to run Elasticsearch in the foreground (and log to STDERR).

Also check what you have in /etc/default/elasticsearch as that is used when starting as a service. Btw, 19GB of heap out of 24GB total is not a good ratio, you are limiting the effectiveness of the file system cache. I'd also check what else is using memory on your system in case you have processes which are competing with each other.
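To put numbers on the heap ratio, here is a rough sketch in shell arithmetic. It assumes the common rule of thumb of giving about half of physical RAM to the heap and leaving the rest to the file system cache; the function names are made up for this example.

```shell
#!/bin/sh
# heap_percent HEAP_GB TOTAL_RAM_GB
# Prints the share of RAM taken by the heap, as an integer percentage
# (rounded down).
heap_percent() {
    echo $(( $1 * 100 / $2 ))
}

# suggest_heap_gb TOTAL_RAM_GB
# Prints half of total RAM in whole GB. This is a rule of thumb, not a
# hard limit; the right value depends on the workload.
suggest_heap_gb() {
    echo $(( $1 / 2 ))
}

heap_percent 19 24    # current setup: 19g heap on a 24GB machine
suggest_heap_gb 24    # starting point under the 50% rule of thumb
```

The result of `suggest_heap_gb` would go into ES_HEAP_SIZE in /etc/default/elasticsearch (for example `ES_HEAP_SIZE=12g`), followed by a restart.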

thomas.dotreppe · December 2, 2015, 4:10pm

His command included -d, but I had already tried it without -d (someone suggested that yesterday on IRC), and it starts just fine. I can give you the output if you'd like.

Should I drop to 12Gb?

thomas.dotreppe · December 2, 2015, 6:36pm

What should I try next?

msimos · December 2, 2015, 9:54pm

Hi,

I'd edit /etc/elasticsearch/logging.yml and change the log level to DEBUG:

es.logger.level: DEBUG

From the initial post it seems like you may only be logging at INFO level. If you set this to DEBUG you may get some additional information as to what's happening. Then restart Elasticsearch and check the log file again.
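Before restarting, it may be worth confirming which level is actually configured. The sketch below assumes the setting sits on its own uncommented line starting with `es.logger.level:`; the function name is hypothetical.

```shell
#!/bin/sh
# current_log_level FILE
# Prints the value of the first uncommented "es.logger.level:" line in
# FILE, or nothing if there is no such line.
current_log_level() {
    sed -n 's/^es\.logger\.level:[[:space:]]*//p' "$1" | head -n 1
}
```

For example, `current_log_level /etc/elasticsearch/logging.yml` should print `DEBUG` once the change has been saved.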

thomas.dotreppe · December 2, 2015, 10:55pm

I stopped it, added that line in /etc/elasticsearch/logging.yml (just had to edit the second line and replace INFO by DEBUG), cleared the log directory and started it using 'service elasticsearch start'.

It hung for 5-10 secs but it didn't generate any logs.

Could it be a problem with the JDK?

msimos · December 2, 2015, 11:03pm

Hi,

Run:

sudo strace -s 99 -f bash /etc/init.d/elasticsearch start

Then paste the last 10 lines here after waiting about 1 minute.

thomas.dotreppe · December 2, 2015, 11:22pm

I output stdout and stderr to a file that I kept just in case. It exited by itself within 10 seconds. Anyway, here are the 10 lines as requested:

rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f9b9938ad40}, {0x4438a0, [], SA_RESTORER, 0x7f9b9938ad40}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=25287, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0x7ffe20c75ed8, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigreturn()                          = 0
write(1, "   ...fail!\n", 12)           = 12
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(1)                           = ?
+++ exited with 1 +++

msimos · December 2, 2015, 11:49pm

Hi,

You probably need to provide some earlier messages, since this only shows that it failed and gives no indication why. You can use something like pastebin to dump the whole thing.
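If the full trace is too large to share, one option is to cut it down first. The sketch below is only a starting point: it keeps execve calls, process exit lines, and failed system calls, and it assumes that ENOENT, EAGAIN and ECHILD failures are mostly noise (an assumption that can hide a real problem, so adjust the list as needed). The function name and log path are hypothetical.

```shell
#!/bin/sh
# strace_highlights FILE
# Prints from an strace log:
#   - every execve() call (always kept, even if it failed),
#   - "+++ exited with N +++" and "+++ killed by SIG +++" lines,
#   - failed syscalls ("= -1 EXXX"), except ENOENT/EAGAIN/ECHILD,
#     which are assumed here to be routine noise.
strace_highlights() {
    awk '
        /execve\(/ || /\+\+\+ (exited with|killed by)/ { print; next }
        /= -1 E[A-Z]+/ && !/= -1 (ENOENT|EAGAIN|ECHILD)/ { print }
    ' "$1"
}
```

Something like `strace_highlights /tmp/es-strace.log | tail -n 50` should give a much shorter excerpt to paste, and the PID from an interesting line can then be searched for in the full file.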

thomas.dotreppe · December 2, 2015, 11:59pm

It's too big for pastebin. The file is 5.41MB. Do you have an email address I can send it to?

msimos · December 3, 2015, 12:12am

Try using Google Drive, Box, gist.github.com, etc.
