Elasticsearch Not Running on Linux


#1

Hi,
The Elasticsearch server stops working abruptly and writes the message below to the logs.
Could you please let me know what the reason could be? Thanks!

org.elasticsearch.transport.ReceiveTimeoutTransportException: [vCcPM9Y][10.20.158.239:9300][cluster:monitor/nodes/stats[n]] request_id [14457] timed out after [21558ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:908) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-5.2.0.jar:5.2.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
[2018-03-23T07:41:46,418][WARN ][o.e.a.a.c.n.s.TransportNodesStatsAction] [vCcPM9Y] not accumulating exceptions, excluding exception from response
org.elasticsearch.action.FailedNodeException: Failed node [vCcPM9YkQ2e6C4p6Q6V_UQ]
	at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.onFailure(TransportNodesAction.java:247) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.access$300(TransportNodesAction.java:160) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1.handleException(TransportNodesAction.java:219) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1024) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:907) [elasticsearch-5.2.0.jar:5.2.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-5.2.0.jar:5.2.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: org.elasticsearch.transport.ReceiveTimeoutTransportException: [vCcPM9Y][10.20.158.239:9300][cluster:monitor/nodes/stats[n]] request_id [14457] timed out after [21558ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:908) ~[elasticsearch-5.2.0.jar:5.2.0]

The top command gives the following output:

25327 elastic+ 20 0 0.380t 6.308g 1.975g S 99.7 80.9 24:00.08 java

free -h command gives:

              total        used        free      shared  buff/cache   available
Mem:           7.8G        4.7G        216M        6.0M        2.9G        2.8G
Swap:          2.0G        9.3M        2.0G

df:

Filesystem                    1K-blocks      Used Available Use% Mounted on
udev                            4068616         0   4068616   0% /dev
tmpfs                            817560      9088    808472   2% /run
/dev/mapper/ubuntu16--vg-root 513372752 404445564  87651040  83% /
tmpfs                           4087788         0   4087788   0% /dev/shm
tmpfs                              5120         0      5120   0% /run/lock
tmpfs                           4087788         0   4087788   0% /sys/fs/cgroup
/dev/sda1                        482922     56270    401718  13% /boot
tmpfs                            817560         0    817560   0% /run/user/1000

(Harsh Bajaj) #2

Hi,

Please let me know which version of Elasticsearch you are using and what is in your config file.


#3

Hi @harshbajaj16

The version is 5.2.0, and here is the config file:

    ################################
    # Elasticsearch
    ################################

    # Elasticsearch home directory
    #ES_HOME=/usr/share/elasticsearch

    # Elasticsearch Java path
    #JAVA_HOME=

    # Elasticsearch configuration directory
    #CONF_DIR=/etc/elasticsearch

    # Elasticsearch data directory
    #DATA_DIR=/var/lib/elasticsearch

    # Elasticsearch logs directory
    #LOG_DIR=/var/log/elasticsearch

    # Elasticsearch PID directory
    #PID_DIR=/var/run/elasticsearch

    # Additional Java OPTS
    #ES_JAVA_OPTS=

    # Configure restart on package upgrade (true, every other setting will lead to not restarting)
    #RESTART_ON_UPGRADE=true


    # The number of seconds to wait before checking if Elasticsearch started successfully as a daemon process
    ES_STARTUP_SLEEP_TIME=5

The heap size in the jvm.options file is set to -Xms4G and -Xmx4G.


(Harsh Bajaj) #4

Please share your elasticsearch.yml file here.


#5
      # Use a descriptive name for the node:
      #
      #node.name: node-1
      #
      # Add custom attributes to the node:
      #
      #node.attr.rack: r1
      #
      # ----------------------------------- Paths ------------------------------------
      #
      # Path to directory where to store the data (separate multiple locations by comma):
      #
      #path.data: /path/to/data
      #
      # Path to log files:
      #
      #path.logs: /path/to/logs
      #
      # ----------------------------------- Memory -----------------------------------
      #
      # Lock the memory on startup:
      #
      bootstrap.memory_lock: true
      #
      # Make sure that the heap size is set to about half the memory available
      # on the system and that the owner of the process is allowed to use this
      # limit.
      #
      # Elasticsearch performs poorly when the system is swapping the memory.
      #
      # ---------------------------------- Network -----------------------------------
      #
      # Set the bind address to a specific IP (IPv4 or IPv6):
      #
      network.host: 0.0.0.0
      #
      # Set a custom port for HTTP:
      #
      #http.port: 9200
      #
      # For more information, consult the network module documentation.
      #
      # --------------------------------- Discovery ----------------------------------
      #
      # Pass an initial list of hosts to perform discovery when new node is started:
      # The default list of hosts is ["127.0.0.1", "[::1]"]
      #
      #discovery.zen.ping.unicast.hosts: ["host1", "host2"]
      #
      # Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):

(Harsh Bajaj) #6

Just set your network host, uncomment the port setting, and try to start Elasticsearch. Then share the error message you get.


(David Pilato) #7

Please format your code, logs or configuration files using the </> icon as explained in this guide and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.


#8

Sure @dadoonet. I believe I have formatted the text the right way now. Thanks!
Could you please help me resolve the issue?

And from the logs, I see GC is happening:

 [2018-03-23T00:01:13,973][INFO ][o.e.m.j.JvmGcMonitorService] [vCcPM9Y] [gc][old][14000][13188] duration [7.4s], collections [1]/[7.7s], total [7.4s]/[1.4d], memory [3.9gb]->[3.9gb]/[3.9gb], all_pools {[young] [15.7mb]->[21.8mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [3.9gb]->[3.9gb]/[3.9gb]}
[2018-03-23T00:01:13,973][WARN ][o.e.m.j.JvmGcMonitorService] [vCcPM9Y] [gc][14000] overhead, spent [7.4s] collecting in the last [7.7s]
[2018-03-23T00:01:25,527][WARN ][o.e.m.j.JvmGcMonitorService] [vCcPM9Y] [gc][old][14001][13189] duration [11.3s], collections [1]/[11.5s], total [11.3s]/[1.4d], memory [3.9gb]->[3.9gb]/[3.9gb], all_pools {[young] [21.8mb]->[18.4mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [3.9gb]->[3.9gb]/[3.9gb]}

(David Pilato) #9

Sounds like you are running short on memory.

Could you share how many indices, shards you have?
How many nodes?
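
For anyone reading along, the GC lines above can be checked mechanically. The snippet below is only a rough sketch: it assumes the `JvmGcMonitorService` line format shown in this thread, handles only durations in seconds and sizes in `gb`, and the function name `summarize_gc_line` is made up for illustration.

```python
import re

# Log line copied verbatim from the post above.
GC_LINE = (
    "[2018-03-23T00:01:25,527][WARN ][o.e.m.j.JvmGcMonitorService] [vCcPM9Y] "
    "[gc][old][14001][13189] duration [11.3s], collections [1]/[11.5s], "
    "total [11.3s]/[1.4d], memory [3.9gb]->[3.9gb]/[3.9gb], "
    "all_pools {[young] [21.8mb]->[18.4mb]/[66.5mb]}"
    "{[survivor] [0b]->[0b]/[8.3mb]}{[old] [3.9gb]->[3.9gb]/[3.9gb]}"
)

# Assumption: durations are reported in seconds and heap sizes in gb,
# as they are in the lines posted here. Other units are not handled.
GC_PATTERN = re.compile(
    r"duration \[(?P<duration>[\d.]+)s\], "
    r"collections \[\d+\]/\[(?P<interval>[\d.]+)s\], "
    r".*?memory \[(?P<before>[\d.]+)gb\]->\[(?P<after>[\d.]+)gb\]"
    r"/\[(?P<max>[\d.]+)gb\]"
)


def summarize_gc_line(line):
    """Return GC overhead and heap figures for one log line, or None."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    values = {name: float(text) for name, text in match.groupdict().items()}
    return {
        # Share of the observed interval that was spent collecting.
        "gc_fraction": values["duration"] / values["interval"],
        # How full the heap still is once the collection has finished.
        "heap_after_fraction": values["after"] / values["max"],
        # Memory actually freed by the collection.
        "reclaimed_gb": values["before"] - values["after"],
    }


if __name__ == "__main__":
    print(summarize_gc_line(GC_LINE))
```

A collection that takes nearly the whole interval, frees nothing, and leaves the heap full is the pattern that points to heap pressure rather than a network or disk problem.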


#10

@dadoonet, I have only one node.

I ran the _cat/allocation API and the results look like this:

shards disk.indices disk.used disk.avail disk.total disk.percent host          ip            node
  1896        384gb   406.7gb     82.8gb    489.5gb           83

The GET _cat/indices API returns 381 indices.


(David Pilato) #11

You probably have too many shards per node.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
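
To put rough numbers on that, here is a minimal sketch using only the figures reported in this thread. The ceiling of 20 shards per GB of heap is an assumption: it is a commonly quoted rule of thumb, not a hard limit and not something stated in this thread. The function name `shard_summary` is likewise just illustrative.

```python
# Figures reported earlier in this thread.
SHARDS = 1896          # from _cat/allocation
INDICES = 381          # from _cat/indices
DISK_INDICES_GB = 384  # disk.indices from _cat/allocation
HEAP_GB = 4            # -Xms4G / -Xmx4G in jvm.options

# Assumption: rule-of-thumb ceiling, not a value taken from this thread.
MAX_SHARDS_PER_GB_HEAP = 20


def shard_summary(shards, indices, data_gb, heap_gb,
                  limit_per_gb=MAX_SHARDS_PER_GB_HEAP):
    """Compute simple sizing ratios for a single-node cluster."""
    return {
        "shards_per_index": shards / indices,
        "avg_shard_size_mb": data_gb * 1024 / shards,
        "shards_per_gb_heap": shards / heap_gb,
        "suggested_max_shards": heap_gb * limit_per_gb,
    }


if __name__ == "__main__":
    for name, value in shard_summary(
        SHARDS, INDICES, DISK_INDICES_GB, HEAP_GB
    ).items():
        print(f"{name}: {value:.1f}")
```

With these inputs the node carries several hundred shards per GB of heap, and the average shard is only a couple of hundred MB, so fewer and larger shards (or more heap and more nodes) would be the direction to look at.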

