Puppet-elasticsearch service reports as started, but isn't started

I have created a manifest to install and configure Elasticsearch via the Puppet module that is referenced in the official documentation.

Everything is installed fine, but the service isn't started even though it should be.

According to the documentation I need to have the status set to enabled to start the service at boot, and ensure it is always running:

status

node qkvd-elm01 {

  include defaults
  include ::java
  
  class { 'elasticsearch':
    version => '6.2.4',
    status => 'enabled',
    datadir_instance_directories => false,
    config => {
      'cluster.name'     => 'devops',
    }
  }

  elasticsearch::instance { 'qkvd-elm01':
    config => {
      'network.host'     => '_site_',
      'node.name'        => 'qkvd-elm01',
      'node.master'      => true,
      'node.data'        => false,
      'node.ingest'      => false,
      'node.ml'          => false,
      'xpack.ml.enabled' => false,
      'discovery.zen.minimum_master_nodes' => 2,
      'discovery.zen.ping.unicast.hosts'   => [
        'qkvd-elm01',
        'qkvd-elm02',
        'qkvd-elm03'
      ],
    }
  }
  
}

After finding that the service isn't started (even though Puppet reports that it is), I ran Puppet again in debug mode:

Debug: Executing: '/usr/bin/systemctl is-active elasticsearch-qkvd-elm01.service'
Debug: Executing: '/usr/bin/systemctl is-enabled elasticsearch-qkvd-elm01.service'
Debug: Executing: '/usr/bin/systemctl unmask elasticsearch-qkvd-elm01.service'
Debug: Executing: '/usr/bin/systemctl start elasticsearch-qkvd-elm01.service'
Debug: Executing: '/usr/bin/systemctl is-enabled elasticsearch-qkvd-elm01.service'
Notice: /Stage[main]/Main/Node[qkvd-elm01]/Elasticsearch::Instance[qkvd-elm01]/Elasticsearch::Service[qkvd-elm01]/Elasticsearch::Service::Systemd[qkvd-elm01]/Service[elasticsearch-instance-qkvd-elm01]/ensure: ensure changed 'stopped' to 'running'
Debug: /Stage[main]/Main/Node[qkvd-elm01]/Elasticsearch::Instance[qkvd-elm01]/Elasticsearch::Service[qkvd-elm01]/Elasticsearch::Service::Systemd[qkvd-elm01]/Service[elasticsearch-instance-qkvd-elm01]: The container Elasticsearch::Service::Systemd[qkvd-elm01] will propagate my refresh event
Info: /Stage[main]/Main/Node[qkvd-elm01]/Elasticsearch::Instance[qkvd-elm01]/Elasticsearch::Service[qkvd-elm01]/Elasticsearch::Service::Systemd[qkvd-elm01]/Service[elasticsearch-instance-qkvd-elm01]: Unscheduling refresh on Service[elasticsearch-instance-qkvd-elm01]
Debug: Elasticsearch::Service::Systemd[qkvd-elm01]: The container Elasticsearch::Service[qkvd-elm01] will propagate my refresh event

From what I can tell, it should be running, but it is not:

[root@qkvd-elm01 ~]# systemctl status elasticsearch-qkvd-elm01.service
● elasticsearch-qkvd-elm01.service - Elasticsearch instance qkvd-elm01
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-qkvd-elm01.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2019-03-27 12:19:56 IST; 22s ago
     Docs: http://www.elastic.co
  Process: 19263 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p /var/run/elasticsearch/elasticsearch-qkvd-elm01.pid --quiet (code=exited, status=1/FAILURE)
 Main PID: 19263 (code=exited, status=1/FAILURE)

I then ran by hand the same command that Puppet prints in debug mode: /usr/bin/systemctl start elasticsearch-qkvd-elm01.service

When I checked the status again, the service was running:

[root@qkvd-elm01 ~]# systemctl status elasticsearch-qkvd-elm01.service
● elasticsearch-qkvd-elm01.service - Elasticsearch instance qkvd-elm01
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-qkvd-elm01.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-03-27 12:20:38 IST; 1s ago
     Docs: http://www.elastic.co
 Main PID: 19334 (java)
   CGroup: /system.slice/elasticsearch-qkvd-elm01.service
           └─19334 /bin/java -Dfile.encoding=UTF-8 -Dio.netty.noKeySetOptimization=true -Dio.netty.noUnsafe=true -Dio.netty.recycler.maxCapacityPerThread=0 -Djava.awt.headless=true -Djna.nosys=true -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -XX:+AlwaysPreTouch -XX:+HeapDumpOnOutOfMemoryError -XX:+...

Mar 27 12:20:38 qkvd-elm01.kayhut.local systemd[1]: Started Elasticsearch instance qkvd-elm01.
Mar 27 12:20:38 qkvd-elm01.kayhut.local systemd[1]: Starting Elasticsearch instance qkvd-elm01...

What am I doing wrong?

It could be that restart_on_change is false.

From CHANGELOG.md:

*restart_on_change* now defaults to false to reduce unexpected cluster downtime (can be set to true if desired).

The Info message in your debug output does say "Unscheduling refresh on Service[elasticsearch-instance-qkvd-elm01]".

That is my best guess...
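If you want to test that guess, a minimal sketch would be the class declaration from your manifest with restart_on_change switched on. The parameter name comes from the CHANGELOG entry above; everything else is copied from your post. Whether this cures the failed first start is an assumption on my part, not something I have verified.

```puppet
# Sketch only: the class declaration from the question, with
# restart_on_change enabled. It is an untested assumption that this
# changes the failed-start behaviour described above.
class { 'elasticsearch':
  version                      => '6.2.4',
  status                       => 'enabled',
  restart_on_change            => true,
  datadir_instance_directories => false,
  config                       => {
    'cluster.name' => 'devops',
  },
}
```

Bear in mind that with this enabled Puppet may restart the node whenever it changes something it manages, which is exactly the downtime risk the default is meant to avoid.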

I always manually restart Elasticsearch just to be safe. As in, I do not let Puppet restart the service, just manage the config and packages.

Thank you for your assistance. You raise an interesting point. I am approaching this more from the point of view of ensuring that the service is always running.

Again, many thanks.

For me, the main problem with letting Puppet manage the service restarts is that I'm afraid the cluster would collapse. There might not be enough master nodes available at one time, or shards would be left unassigned if they do not have time to reallocate.

So once packages and/or configs are updated, I can do a Rolling Upgrade, and because I have set up Shard Allocation Awareness I can restart 1/4 of the nodes in my "production" cluster at a time (one awareness "zone" at a time).

Hey @Alex_L - the module should at least be ensuring that the service is up; though @A_B is right, the module tries to be careful by not restarting services by default, to avoid causing downtime. If the module couldn't bring the service back up, there's a distinct possibility that something else had trouble.
