Index UUIDs in snapshots

Hello,

We upgraded to ES 5 and are trying to check the disk usage of our snapshots by index. The new scheme of using index UUIDs as folder names makes it impossible for us to know which folder corresponds to which index.

For indices that are still on ES we can get an index's UUID with the _cat/indices API, but for indices that have been deleted from the cluster we cannot find that info, and the snapshot API does not provide it either:

  ...
  {
    "snapshot": "XXX-snapshot-20170219020644",
    "uuid": "XXXXXXXXXXXXXXXXXXX",
    "version_id": 5010299,
    "version": "5.1.2",
    "indices": [
      "XXXXXX-2017.02.18",
      "XXXXXX-2017.02.17"
    ],
    "state": "SUCCESS",
    "start_time": "2017-02-19T02:06:44.710Z",
    "start_time_in_millis": 1487470004710,
    "end_time": "2017-02-19T02:09:06.938Z",
    "end_time_in_millis": 1487470146938,
    "duration_in_millis": 142228,
    "failures": [],
    "shards": {
      "total": 4,
      "failed": 0,
      "successful": 4
    }
  },
  ...

That API provides the snapshot UUID but not the index UUID.

If there was some API to return the snapshot disk usage by index we'd be ok with that too of course.

Note that we're using S3-backed snapshot repositories.

It's not actually the index uuid that is used, see the comment here: https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/repositories/IndexId.java#L64-L72
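If the goal is just to map folder names back to index names, a rough sketch like the following might help. It assumes (I have not verified this against every 5.x release) that the repository root holds a JSON metadata blob, which I believe is named index-N, containing an "indices" object keyed by index name, where each entry carries the folder ID in an "id" field. The function name `folder_to_index_name` and the sample values are made up for illustration; please check the shape against your own repository before relying on it.

```python
import json


def folder_to_index_name(repository_data):
    """Invert the assumed {"indices": {name: {"id": folder_id}}} layout.

    The layout is an assumption about the repository metadata blob,
    not a documented API.
    """
    mapping = {}
    for index_name, entry in repository_data.get("indices", {}).items():
        mapping[entry["id"]] = index_name
    return mapping


# Hypothetical sample content, only to show the assumed shape.
sample = json.loads("""
{
  "snapshots": [{"name": "snap-1", "uuid": "s-uuid-1"}],
  "indices": {
    "logs-2017.02.18": {"id": "folder-a", "snapshots": [{"name": "snap-1", "uuid": "s-uuid-1"}]},
    "logs-2017.02.17": {"id": "folder-b", "snapshots": [{"name": "snap-1", "uuid": "s-uuid-1"}]}
  }
}
""")

print(folder_to_index_name(sample))
```

Since this reads an internal file rather than an API, the format could change between versions.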

I'm not sure whether we should expose the location of the actual files in our APIs or whether that should remain an implementation detail. If you think that this is an important feature to have, please open an issue on Elasticsearch's GitHub repository and explicitly mark it as a feature request.

Haha nice trap with that ID :smiley:

As I said above, my goal is not really to find out the paths but to see how much disk space each index takes up in the repository. I was able to do that back on 2.0, when paths were named after indices, but not anymore.
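For reference, this is roughly the kind of per-folder total I mean, as a minimal sketch. It assumes you already have a listing of (key, size) pairs from the bucket (for example from an S3 object listing) and that index data lives under an `indices/<folder>/` prefix. The function name `usage_by_index_folder`, the prefix, and the sample keys are illustrative assumptions, not something the snapshot API provides.

```python
from collections import defaultdict


def usage_by_index_folder(objects, prefix="indices/"):
    """Sum object sizes per first-level folder under the given prefix.

    `objects` is an iterable of (key, size_in_bytes) pairs; the
    "indices/<folder>/..." layout is an assumption about the repository.
    """
    totals = defaultdict(int)
    for key, size in objects:
        if not key.startswith(prefix):
            continue  # skip repository-level files outside the prefix
        folder, sep, _ = key[len(prefix):].partition("/")
        if not sep or not folder:
            continue  # skip keys that are not inside a folder
        totals[folder] += size
    return dict(totals)


# Hypothetical listing, only to show the input shape.
listing = [
    ("indices/folder-a/0/__0", 100),
    ("indices/folder-a/0/__1", 50),
    ("indices/folder-b/meta-1.dat", 10),
    ("index-5", 7),
]
print(usage_by_index_folder(listing))
```

This only gives totals per folder ID, so without a way to map IDs to index names it does not solve the problem by itself.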

Do you think I should open a feature request for exposing the snapshotted index sizes in the API? Or maybe that's already available somewhere else and I missed it?

I could not find any feature request like that. Maybe it makes sense to formulate the feature request more generally, for example to have an index-level view on snapshots instead of a snapshot-ID-based view, i.e., an API where you can ask which snapshots contain index XYZ. The API could then return the size of the files referenced by each snapshot and also the total size.
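Until something like that exists, the "which snapshots contain index XYZ" part can be approximated on the client side by inverting the snapshot list. The sketch below is only an illustration: it relies on the "snapshot" and "indices" fields shown in the response earlier in this thread, the function name `snapshots_by_index` is made up, and it says nothing about sizes, which the current response does not include.

```python
def snapshots_by_index(snapshots):
    """Build an index-name -> [snapshot names] view from snapshot entries.

    Each entry is expected to have "snapshot" (name) and "indices" (list
    of index names), as in the snapshot API response quoted above.
    """
    view = {}
    for snap in snapshots:
        for index_name in snap.get("indices", []):
            view.setdefault(index_name, []).append(snap["snapshot"])
    return view


# Hypothetical entries in the same shape as the quoted response.
entries = [
    {"snapshot": "snap-20170219", "indices": ["logs-2017.02.18", "logs-2017.02.17"]},
    {"snapshot": "snap-20170218", "indices": ["logs-2017.02.17"]},
]
print(snapshots_by_index(entries))
```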

Thanks m8, opened up the feature request: https://github.com/elastic/elasticsearch/issues/23479
