Cluster restart deletes xpack license and kibana indexes


#1

Hi Everyone,

I am running an Elasticsearch cluster deployed in Docker containers. When I restarted the cluster, the Elasticsearch license appeared to be deleted, along with all of the Kibana dashboards. Does anyone know why this occurred and how to prevent it in the future?

This was originally asked here but was closed due to inactivity. I am reopening this issue as it is a big problem for us. Would definitely appreciate an answer.

Thanks

--John


(David Turner) #3

In your previous thread you indicated that you're using persistent volumes for your containers. However, the most likely explanation for this is that Elasticsearch is storing its data somewhere ephemeral within the container, and not on the persistent volumes. Can you check that the volumes are being mounted where you expect them to be, that Elasticsearch is configured to use them as its data path, and that Elasticsearch is creating files in those volumes (at least a folder called nodes)?
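A few commands along these lines can confirm each of those points; the container name and paths below are assumptions to adjust for your deployment:

```shell
# Hypothetical container name (es-node-1) and mount point -- adjust as needed.
# 1. Where are the volumes actually mounted?
docker inspect es-node-1 --format '{{ json .Mounts }}'
# 2. What data path is Elasticsearch configured to use?
curl -s 'http://localhost:9200/_nodes/settings?filter_path=nodes.*.settings.path.data'
# 3. Is Elasticsearch creating files there (at least a "nodes" folder)?
docker exec es-node-1 ls /usr/share/elasticsearch/data
```

If the path reported in step 2 is not on the volume from step 1, the data is living inside the ephemeral container layer.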


#4

Hi David,

Thanks so much for following up so quickly! Excellent questions and yes, the nodes folders are being created as expected on the host machines. What's weird is that all of the other indices--namely the documents we are loading into ES and searching on--are retained, but I gather the Kibana and X-Pack indices are not. Are there any settings you can think of that may explain why only the Kibana and X-Pack indices are not retained? Definitely a head-scratcher for me.

Thanks again for the follow-up!

--John


#5

Just to follow-up, is there any xpack or kibana data that would be written out to anything other than a configured data directory, like the elasticsearch/config directory?


(David Turner) #6

This is something of a puzzle then. Let's focus on the .kibana index first. From Elasticsearch's point of view it's just another index, although if you're also running Kibana then it'll get auto-created if it vanishes, which might make diagnosis a bit trickier.

Can you share the logs from the restart of your cluster? I'm curious to see whether it looks like a restart of an existing cluster or like a new cluster starting up and auto-importing some data that it happened to find lying around. If they're large, use gist.github.com and paste a link here.


#7

Hi David,

Okay...here's another detail. I am running Kibana in its own Docker container. Is it possible that I need to mount a directory for the Kibana container? It appears the Kibana container is starting up, thinks there is no .kibana index, and recreates it. So...is there some signal other than seeing the .kibana index that tells Kibana it isn't a fresh install? I am looking at the official Elasticsearch/Kibana documentation and I am not seeing any evidence to support my hypothesis.

I am gonna stop/start my cluster and watch the logs as you suggested and see if the .kibana index disappears at some point.

--John


(David Turner) #8

I'm afraid I'm not 100% familiar with Kibana, but I would expect its state to be stored in Elasticsearch as a general rule. It's not something silly like the Kibana container also running its own Elasticsearch node, with ephemeral storage? This will tell us some useful info about the .kibana index:

GET .kibana/_stats?level=shards&filter_path=indices.*.shards.*.shard_path,indices.*.shards.*.routing

Here, it says this:

{
  "indices": {
    ".kibana": {
      "shards": {
        "0": [
          {
            "shard_path": {
              "state_path": "/Users/davidturner/stack-6.3.0/elasticsearch-6.3.0/data/nodes/0",
              "data_path": "/Users/davidturner/stack-6.3.0/elasticsearch-6.3.0/data/nodes/0",
              "is_custom_data_path": false
            },
            "routing": {
              "state": "STARTED",
              "node": "9lL0_BbNRtSzW7IqOLypYQ",
              "primary": true,
              "relocating_node": null
            }
          }
        ]
      }
    }
  }
}

I.e. I have a single shard, no replicas, and there's its path on disk. More precisely, it's here:

$ ls -ald /Users/davidturner/stack-6.3.0/elasticsearch-6.3.0/data/nodes/0/indices/$(curl -s http://localhost:9200/_cat/indices/.kibana?h=uuid)
drwxr-xr-x  4 davidturner  staff  128  6 Jul 14:36 /Users/davidturner/stack-6.3.0/elasticsearch-6.3.0/data/nodes/0/indices/_cbVWr-kSo2kwGIppXv2tg

#9

Ah, interesting. So there's a state path and a data path, and there are no replicas by default.

The Kibana container is not running its own ES instance within the container. Definitely another great question to ask, but the .kibana index is stored within the ES cluster.

I think this may be a race condition. If the Kibana container starts before the ES cluster has elected a leader and is able to respond to queries, I wonder if Kibana concludes there is no .kibana index, assumes it's a fresh install, and therefore creates a new one. This is possible because I am deploying all of the Elasticsearch and Kibana containers on Marathon. Anytime the list of Elasticsearch containers changes in Marathon, Kibana restarts. So...if I bring down the Elasticsearch cluster, Kibana immediately restarts. I wonder if I need to shut down the Kibana container and THEN bring down the ES cluster. Does that make sense?
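An ordered shutdown would avoid that window entirely; a sketch with hypothetical container names:

```shell
# Stop Kibana first, then the ES nodes, so Kibana never observes a
# cluster that appears to be missing the .kibana index.
docker stop kibana
docker stop es-node-1 es-node-2 es-node-3
# On restart, reverse the order: bring ES up (and wait for it to elect
# a master and report the .kibana index) before starting Kibana again.
```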

--John


(David Turner) #10

I think Elasticsearch should render this race benign: creating an index requires an elected leader, and fails (without deleting anything) if the index already exists. This is, however, not true if the cluster starts afresh and imports existing data, and for this it'd be useful to see logs.
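That behaviour is easy to check against a running cluster; a quick sketch (localhost endpoint assumed):

```shell
# Attempting to create an index that already exists fails without
# deleting anything; the exact exception name varies by version
# (e.g. resource_already_exists_exception in 6.x).
curl -s -XPUT 'http://localhost:9200/.kibana'
```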

Could you also confirm that the cluster UUID doesn't change across these restarts?

GET /?filter_path=cluster_uuid

It'd give us another data point, at least.
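One way to capture that data point across a restart (localhost endpoint assumed):

```shell
# Record the cluster UUID before and after the restart; a changed UUID
# would mean the cluster is starting afresh rather than restarting.
curl -s 'http://localhost:9200/?filter_path=cluster_uuid' > uuid_before.json
# ... restart the cluster ...
curl -s 'http://localhost:9200/?filter_path=cluster_uuid' > uuid_after.json
diff uuid_before.json uuid_after.json && echo 'cluster UUID unchanged'
```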


#11

Okay, just tested this on a couple of small clusters (< 5 nodes) and confirmed there is no race condition and all .kibana indices are preserved. I will test the one large cluster we have early next week when it's back online and will report the results.

What I can definitely prove is that the xpack license information is lost when an ES cluster of any size composed of Docker containers is restarted. Totally repeatable. Question--does the plugin metadata get saved in a non-ES-index data store or directory?

Thanks again, by the way, for your time--much, much appreciated!

--John


(David Turner) #12

The licence is stored in the cluster state, and there are other things that suggest that the cluster state is not being preserved across restarts. However, there might be other explanations, which is why it'd be useful for you to confirm whether the cluster UUID changes across restarts or not.

The metadata is stored within the Elasticsearch data path but not within any specific index. This shouldn't matter, but I wonder if perhaps you're mounting the persistent volume as a subfolder of the data path? Everything within the data path should be on the same filesystem and preserved across restarts, but if for example you're only preserving $DATA_PATH/nodes/0/indices then that could perhaps explain the effects we're seeing here.
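The directory layout makes the risk concrete; a self-contained sketch of the ES 6.x data directory structure (fake paths, created locally purely for illustration):

```shell
# The global cluster state (which includes the licence) lives in
# nodes/0/_state, a *sibling* of nodes/0/indices -- so a persistent
# volume mounted only at the indices/ subfolder would preserve the
# document indices but lose the licence on every restart.
DATA_PATH=$(mktemp -d)
mkdir -p "$DATA_PATH/nodes/0/_state" "$DATA_PATH/nodes/0/indices/_cbVWr-kSo2kwGIppXv2tg"
touch "$DATA_PATH/nodes/0/_state/global-1.st"   # global metadata, incl. licence
ls "$DATA_PATH/nodes/0"   # shows _state and indices side by side
```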


#13

Awesome information and questions! I am pretty sure that the data path I have configured is at the $DATA_PATH level you have specified here because I see nodes/0/indices in that directory. I will re-check now and will also confirm whether the cluster UUID changes.

--John


#14

Hi David,

Okay, a couple of things:

  1. cluster_uuid--that remains constant across multiple shutdown/restart cycles.
  2. DATA_PATH--I have the following Docker volume configuration on one of the hosts: /srv/data9/:/data1. The cluster name is es, so the path to the nodes directory on the host is /srv/data9/es/nodes.

--John


(David Turner) #15

Ok, I think we're going to need to see some logs to dig into this further.


#16

Cool, understood, I will get 'em on Monday. Thanks again for all of your help and have a great weekend!


#17

Looking at the logs there are no error messages related to the xpack plugin:

[2018-07-09T11:45:41,423][INFO][o.e.l.LicenseService ][] license [*] mode [basic] - valid

Is there anything else I should be looking for?

--John


(David Turner) #18

Could you share the logs? TBH I don't know exactly what I'm looking for yet, but hopefully there's something out of the ordinary there.


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.