Unable to list snapshots in Elasticsearch repository

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v

This time, please post the output as text rather than screenshots, as you did previously.

Is there any chance that you registered the same repository on multiple clusters using different versions of the cluster?

I think we want to see the stack trace to better understand what exactly is failing:

curl -X GET "10.0.1.159:9200/_cat/snapshots/temp_elastic_backup?error_trace"

No, the version was the same.
Here is the curl output:

curl -X GET "localhost:9200/"
{
  "name" : "loges-prod-15-uv03.",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "JQASlHJoQlO737P9hPV6cg",
  "version" : {
    "number" : "7.4.1",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "fc0eeb6e2c25915d63d871d344e3d0b45ea0ea1e",
    "build_date" : "2019-10-22T17:16:35.176724Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

/cat/nodes

curl localhost:9200/_cat/nodes
127.0.0.1 24 15 0 0.00 0.01 0.05 dilm * loges-prod-15-uv03.

/health/

curl -X GET "localhost:9200/_cluster/health?wait_for_status=yellow&timeout=50s&pretty"
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Here is the output:

curl -X GET "localhost:9200/_cat/snapshots/temp_elastic_backup?error_trace"

{"error":{"root_cause":[{"type":"parse_exception","reason":"start object expected","stack_trace":"ElasticsearchParseException[start object expected]\n\tat org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:431)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:790)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:769)\n\tat org.elasticsearch.snapshots.SnapshotsService.getRepositoryData(SnapshotsService.java:157)\n\tat org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:99)\n\tat org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:56)\n\tat org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:94)\n\tat org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.lambda$doStart$3(TransportMasterNodeAction.java:165)\n\tat org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:45)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:830)\n"}],"type":"parse_exception","reason":"start object expected","stack_trace":"ElasticsearchParseException[start object expected]\n\tat org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:431)\n\tat 
org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:790)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:769)\n\tat org.elasticsearch.snapshots.SnapshotsService.getRepositoryData(SnapshotsService.java:157)\n\tat org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:99)\n\tat org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:56)\n\tat org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:94)\n\tat org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.lambda$doStart$3(TransportMasterNodeAction.java:165)\n\tat org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:45)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:830)\n"},"status":400}

So you have only a single node in this cluster and absolutely no data in it?

Thanks. What version is this from?


edit: 7.4.1 it seems, I thought you had upgraded to 7.12.1 but that stack trace doesn't make sense there:

ElasticsearchParseException[start object expected]
        at org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:431)
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:790)
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:769)
        at org.elasticsearch.snapshots.SnapshotsService.getRepositoryData(SnapshotsService.java:157)
        at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:99)
        at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:56)
        at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:94)
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.lambda$doStart$3(TransportMasterNodeAction.java:165)
        at org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:45)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:830)

Your repository contents appear to be corrupted: the top-level metadata file does not contain a JSON object (it's probably empty). There have been many changes in how Elasticsearch interacts with the repository since 7.4 (which is past EOL) so I'm not sure if this might be explained by a bug that's since been fixed, or whether it's due to some problem outside of Elasticsearch. I'm pretty sure it's impossible for recent versions to get into this state anyway. I suggest you upgrade to 7.12 and start again with a fresh repository.
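The "does not contain a JSON object" diagnosis can be checked directly against a filesystem repository. The following is a minimal Python sketch, not an Elasticsearch tool. It assumes the top-level metadata files are the `index-N` blobs in the repository root and that they hold plain JSON, which matches the stack traces in this thread but should be verified for your version. The function names are hypothetical.

```python
import json
from pathlib import Path


def classify_metadata_file(path: Path) -> str:
    """Return 'ok', 'empty', or 'invalid' for one candidate metadata file."""
    raw = path.read_bytes()
    if not raw.strip():
        # A zero-length (or whitespace-only) file cannot hold a JSON object.
        return "empty"
    try:
        parsed = json.loads(raw)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return "invalid"
    # The parser in the stack trace expects a top-level object, not a list or scalar.
    return "ok" if isinstance(parsed, dict) else "invalid"


def check_repository_root(repo_dir: str) -> dict:
    """Classify every index-N file found directly under repo_dir (assumed layout)."""
    results = {}
    for path in sorted(Path(repo_dir).glob("index-*")):
        if path.is_file():
            results[path.name] = classify_metadata_file(path)
    return results
```

Run it read-only against the repository path, for example `check_repository_root("/mnt/nfs/temp_elastic_backup")` (a hypothetical mount point). Any entry reported as `empty` is consistent with the `start object expected` error above.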

No, this is a fresh VM that connects to the repo.

I can upgrade my node and read the repo again with error_trace.

This repository is important to me because it backs up our production system logs, which are rotated to an NFS share for archiving...

Ok, do you have a repository backup e.g. on tape?

Yeah, but we don't do this every day ;(

Here's the output from the latest version.
It looks the same...

curl -X GET "localhost:9200/_cat/snapshots/temp_elastic_backup?error_trace"

{"error":{"root_cause":[{"type":"parsing_exception","reason":"Failed to parse object: expecting token of type [START_OBJECT] but found [null]","line":1,"col":0,"stack_trace":"ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]]\n\tat org.elasticsearch.common.xcontent.XContentParserUtils.parsingException(XContentParserUtils.java:71)\n\tat org.elasticsearch.common.xcontent.XContentParserUtils.ensureExpectedToken(XContentParserUtils.java:65)\n\tat org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:708)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:1706)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1528)\n\tat org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:831)\n"}],"type":"repository_exception","reason":"[temp_elastic_backup] Unexpected exception when loading repository data","caused_by":{"type":"parsing_exception","reason":"Failed to parse object: expecting token of type [START_OBJECT] but found [null]","line":1,"col":0,"stack_trace":"ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]]\n\tat org.elasticsearch.common.xcontent.XContentParserUtils.parsingException(XContentParserUtils.java:71)\n\tat org.elasticsearch.common.xcontent.XContentParserUtils.ensureExpectedToken(XContentParserUtils.java:65)\n\tat 
org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:708)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:1706)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1528)\n\tat org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:831)\n"},"stack_trace":"RepositoryException[[temp_elastic_backup] Unexpected exception when loading repository data]; nested: ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]];\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1578)\n\tat org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:831)\nCaused by: ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]]\n\tat org.elasticsearch.common.xcontent.XContentParserUtils.parsingException(XContentParserUtils.java:71)\n\tat 
org.elasticsearch.common.xcontent.XContentParserUtils.ensureExpectedToken(XContentParserUtils.java:65)\n\tat org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:708)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:1706)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1528)\n\t... 6 more\n"},"status":500}0
RepositoryException[[temp_elastic_backup] Unexpected exception when loading repository data]; nested: ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]];
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1578)
        at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
        at java.base/java.lang.Thread.run(Thread.java:831)
Caused by: ParsingException[Failed to parse object: expecting token of type [START_OBJECT] but found [null]]
        at org.elasticsearch.common.xcontent.XContentParserUtils.parsingException(XContentParserUtils.java:71)
        at org.elasticsearch.common.xcontent.XContentParserUtils.ensureExpectedToken(XContentParserUtils.java:65)
        at org.elasticsearch.repositories.RepositoryData.snapshotsFromXContent(RepositoryData.java:708)
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:1706)
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1528)
        ... 6 more

Yes, same problem, just a bit more detail: expecting token of type [START_OBJECT] but found [null].

So you suggest this is a problem at the NFS layer (filesystem, disk, etc.)?

I haven't seen the repository contents, but assuming this file is empty I can't see a way for that to have happened without a storage problem (e.g. an fsync() was ignored or a rename was not atomic).
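For readers unfamiliar with why an ignored fsync() or a non-atomic rename can leave an empty file behind, here is a generic write-then-rename pattern sketched in Python. It is an illustration of the general technique only and is not Elasticsearch's implementation. The `durable_write` name and the `.tmp` suffix are hypothetical, and the directory sync step assumes a POSIX system.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data so a reader sees either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temporary name
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        # Ask the OS to persist the contents. If the storage layer ignores
        # this call, the file can later turn out to be empty.
        os.fsync(f.fileno())
    # Rename into place. This is atomic on POSIX local filesystems; if the
    # storage layer does not honour that, readers can see a partial file.
    os.replace(tmp_path, path)
    # Persist the rename itself by syncing the containing directory (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The pattern is only as reliable as the storage underneath it: every call above can return successfully while the data is still lost if the server or export does not keep its durability promises.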

I can send you my repository contents, or any specific filesystem info that is needed.
Thank you so much.

I sent you a private message with a link.

Just closing the loop on this one, we investigated further in another channel and determined that there was some bad problem somewhere on the NFS side. The repository listing contained multiple zero-length files:

-rw-r--r-- 1 elasticsearch elasticsearch 0 Apr 21 07:27 index-66
-rw-r--r-- 1 elasticsearch elasticsearch 0 Apr 21 07:27 index.latest
...
-rw-r--r-- 1 elasticsearch elasticsearch 0 Apr 21 07:27 snap--blahblahblah.dat

Elasticsearch writes these files one at a time, and stops on failure, so the only explanation is that the repository is reporting successful writes and then losing the data that was written.
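A listing like the one above can be produced for a whole repository tree with a short script. This is a minimal Python sketch under the assumption that the repository is mounted as a normal directory. The function name is hypothetical, and it only reads file sizes without modifying anything.

```python
import os


def find_zero_length_files(repo_dir: str) -> list:
    """Return the sorted relative paths of all zero-length files under repo_dir."""
    empty = []
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            full = os.path.join(root, name)
            # A size of 0 was the symptom seen in this thread.
            if os.path.getsize(full) == 0:
                empty.append(os.path.relpath(full, repo_dir))
    return sorted(empty)
```

If several files written at about the same time all show up as empty, as they did here, that points at the storage layer rather than at a single failed write.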
