Conundrum with repositories

When I query my repositories using the API, I get a list of the repositories as expected:

http://localhost:9200/_snapshot/*?pretty

{
  "version6" : {
    "type" : "fs",
    "settings" : {
      "location" : "/data/elasticsearch/backups/version6"
    }
  },
  "version7" : {
    "type" : "fs",
    "settings" : {
      "compress" : "false",
      "location" : "/data/elasticsearch/backups/version7"
    }
  },
  "daily" : {
    "type" : "fs",
    "settings" : {
      "compress" : "true",
      "location" : "/data/elasticsearch/backups/daily"
    }
  }
}

But when I try to access the repository, ES says it does not exist:

http://localhost:9200/_cat/snapshots/daily?pretty

{
  "error" : {
    "root_cause" : [
      {
        "type" : "repository_missing_exception",
        "reason" : "[daily] missing"
      }
    ],
    "type" : "repository_missing_exception",
    "reason" : "[daily] missing"
  },
  "status" : 404
}

This is the same for all the listed repositories.

The repositories are shared-filesystem repositories, and I recently moved everything to a different system with a lot more disk space. I verified that everything looked OK, then created the daily repository and did a full backup, which worked. A week later I found the situation above.

Any ideas what is going on?

I had a light bulb moment.

It had to be `path.repo`, and when I looked at the configuration it was empty, caused by an issue with the Puppet configuration.
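The fix was just to put the setting back. A minimal sketch of the relevant `elasticsearch.yml` line (using the backup path from my repository definitions above):

```yaml
# elasticsearch.yml -- shared-filesystem ("fs") repositories only work when
# their location sits inside one of the path.repo entries; an empty value
# breaks all of them at once
path.repo: ["/data/elasticsearch/backups"]
```

`path.repo` is a static setting, so the node has to be restarted before it is picked up.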

What confused me was that parts of the system seemed to know about the repos and other parts did not. If the repositories had completely vanished, I am pretty sure I would have fingered the problem much sooner!


Agreed that this isn't great; I would expect to see errors in the logs in this situation too.

The discrepancy is because GET _snapshot/* returns a list of repositories according to the cluster config, regardless of whether that config is ok or not, whereas GET _cat/snapshots/daily is trying to list the snapshots within the repository called daily, so it's closer to GET _snapshot/daily/*, i.e. the get snapshots API.
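To illustrate the difference, here are the two calls side by side (hypothetical curl invocations against the node from the question; the second reproduces what `_cat/snapshots/daily` does under the hood):

```shell
# Reads the repository definitions from the cluster state only --
# this succeeds even when path.repo no longer covers those locations
curl -s 'http://localhost:9200/_snapshot/*?pretty'

# Actually opens the repository on disk to enumerate its snapshots,
# which is why it fails here with repository_missing_exception
curl -s 'http://localhost:9200/_snapshot/daily/*?pretty'
```

There is also `POST _snapshot/daily/_verify`, which checks that the nodes can actually write to the repository location and is a quicker way to spot this kind of misconfiguration.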

I think you can only get into this state if you start out with a valid config and then move to a different system where it's no longer valid. Because you can do this kind of move in a rolling fashion, there's not really a point at which we can block your progress to let you know something is wrong.

What happened was that (due to my fat fingers :wink:) Puppet pushed a yml file with `path.repo` empty. There may well have been errors I missed in the logs, but as we know it is really difficult to find meaningful stuff in the logs with all the stack traces. Sigh...

BTW, is there a way of getting ES to load a new configuration file without restarting the node?

Puppet pushes out the changes, and I used to restart the nodes from Puppet, but this happens at random intervals and I soon learnt it is a really bad idea: you can end up with two master-eligible nodes down at the same time :(