System info:
- Ubuntu 14.04 LTS, 64-bit
- Elasticsearch 1.7.3 from the Elastic repo
I've been using this install for quite a while with no problems. It suddenly crashed for an unknown reason. Unfortunately, nothing was logged about the crash, and starting it with 'service elasticsearch start' doesn't work: it hangs for about 10 seconds, then fails, and no new logs are generated in /var/log/elasticsearch (there are logs, but they are all from before the crash).
Here are the last few lines of the log file, in case they are useful (indexing_slowlog and search_slowlog are both empty):
[2015-11-30 10:23:17,551][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.01.30] update_mapping [monit] (dynamic)
[2015-11-30 10:57:01,403][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [snort] (dynamic)
[2015-11-30 11:43:31,081][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [sophos] (dynamic)
[2015-11-30 12:26:45,955][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [fortinet] (dynamic)
[2015-11-30 12:28:10,743][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [fortinet] (dynamic)
[2015-11-30 13:36:47,798][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [auth] (dynamic)
[2015-11-30 13:37:04,885][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [auth] (dynamic)
[2015-11-30 14:34:35,672][INFO ][cluster.metadata ] [Tomorrow Man] [logstash-2015.11.30] update_mapping [iptables-dropped] (dynamic)
[2015-11-30 14:51:38,409][WARN ][monitor.jvm ] [Tomorrow Man] [gc][young][2427240][80285] duration [1s], collections [1]/[1.7s], total [1s]/[1.7h], memory [10.6gb]->[10.2gb]/[18.9gb], all_pools {[young] [396.7mb]->[1mb]/[399.4mb]}{[survivor] [24.4mb]->[23.2mb]/[49.8mb]}{[old] [10.2gb]->[10.2gb]/[18.5gb]}
[2015-11-30 14:57:44,552][WARN ][monitor.jvm ] [Tomorrow Man] [gc][young][2427567][80301] duration [4.1s], collections [1]/[5.1s], total [4.1s]/[1.7h], memory [10.6gb]->[10.2gb]/[18.9gb], all_pools {[young] [398.5mb]->[9.5mb]/[399.4mb]}{[survivor] [22.2mb]->[22.9mb]/[49.8mb]}{[old] [10.2gb]->[10.2gb]/[18.5gb]}
I checked whether /var/run/elasticsearch has the right ownership and permissions, and it does (it is owned by the 'elasticsearch' user):
~# ls /var/run/ -al | grep elastic
drwxr-xr-x 2 elasticsearch elasticsearch 60 Nov 2 12:23 elasticsearch
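The same ownership check can be extended to the other directories Elasticsearch is started with (the log, data, and work paths). Below is a minimal sketch, assuming GNU stat as shipped on Ubuntu. The helper name check_owner is made up for illustration and is not part of any Elasticsearch tooling.

```shell
#!/bin/sh
# check_owner USER DIR...
# Prints one line per directory saying whether it exists and is owned by USER.
# Assumption: GNU stat (supports -c '%U'), as on Ubuntu.
check_owner() {
    expected_user="$1"
    shift
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "MISSING  $dir"
            status=1
            continue
        fi
        owner=$(stat -c '%U' "$dir")
        if [ "$owner" = "$expected_user" ]; then
            echo "OK       $dir"
        else
            echo "MISMATCH $dir (owner: $owner)"
            status=1
        fi
    done
    return $status
}

# Example call, using the paths passed on the Elasticsearch command line:
# check_owner elasticsearch /var/run/elasticsearch /var/log/elasticsearch \
#     /var/lib/elasticsearch /tmp/elasticsearch
```

A MISSING or MISMATCH line would not prove anything by itself, but it would point at a directory worth a closer look.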
Following http://sandlininc.com/?p=747, I added the log line
log_daemon_msg "sudo -u $ES_USER $DAEMON $DAEMON_OPTS"
right before the startup command. Running the sudo command that this prints by hand works fine (and shows no errors), but starting it through 'service' still fails:
sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid --default.config=/etc/elasticsearch/elasticsearch.yml --default.path.home=/usr/share/elasticsearch --default.path.logs=/var/log/elasticsearch --default.path.data=/var/lib/elasticsearch --default.path.work=/tmp/elasticsearch --default.path.conf=/etc/elasticsearch
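Since the manual command works and 'service' does not, one way to see where the two diverge is to run the init script with shell tracing on, so that every command it executes is echoed. This is only a sketch: the helper name trace_init is made up, and the script path /etc/init.d/elasticsearch in the example is an assumption about where the package installs it.

```shell
#!/bin/sh
# trace_init SCRIPT ACTION TRACE_FILE
# Runs SCRIPT under `sh -x`, writing its output and the command trace
# to TRACE_FILE, then prints the script's exit status.
trace_init() {
    script="$1"
    action="$2"
    trace_file="$3"
    rc=0
    sh -x "$script" "$action" >"$trace_file" 2>&1 || rc=$?
    echo "exit status: $rc"
}

# Example (run as root; the path is an assumption):
# trace_init /etc/init.d/elasticsearch start /tmp/es-init-trace.log
```

The last few lines of the trace file should show which command the script ran just before giving up, which can then be compared with the manual command.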
Unfortunately, updating to 2.1.0 is not an option at this time since another tool we use depends on 1.7.3 to work.
As someone else mentioned, there is more than enough disk space (over 250 GB free). RAM isn't an issue either: the system has 24 GB, of which 19 GB is given to Elasticsearch (ES_HEAP_SIZE=19g in /etc/default/elasticsearch).
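For anyone who wants to double-check the memory figures, here is a small sketch of the arithmetic. The function name heap_headroom is made up, and the two numbers are simply the RAM and heap sizes quoted above.

```shell
#!/bin/sh
# heap_headroom RAM_GB HEAP_GB
# Prints the heap's share of RAM and what is left outside the JVM heap.
# Uses integer arithmetic, so the percentage is rounded down.
heap_headroom() {
    ram_gb="$1"
    heap_gb="$2"
    echo "heap ${heap_gb}g of ${ram_gb}g RAM: $((heap_gb * 100 / ram_gb))% heap, $((ram_gb - heap_gb))g left outside the heap"
}

heap_headroom 24 19
# prints: heap 19g of 24g RAM: 79% heap, 5g left outside the heap
```

Whether 5 GB outside the heap is enough for everything else on the machine depends on what else runs there, so this is a figure to keep in mind rather than a diagnosis.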
Is there anything else I can do or look for to debug this issue?