I read through the past couple of months' worth of "missing data" topics on
this list and didn't really find anything that helps, so apologies if I
missed something obvious in them.
Yesterday evening I ran OS updates to fix the glibc vulnerability, which
meant restarting my 4 ES nodes. Today I checked ElasticHQ, and a bunch of
this month's logstash indices are missing tons of data. One day went from
over a million logs to 53.
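For reference, a rough sketch of double-checking those counts straight from
the _cat API instead of through ElasticHQ (the host name is just one of my
node names with the default port, adjust as needed; the index pattern matches
my daily logstash indices for this month):

import requests

ES = "http://log-elasticsearch-02:9200"

# One line per index; h= picks the columns to return.
resp = requests.get(ES + "/_cat/indices/logstash-2015.01.*",
                    params={"h": "index,pri,rep,docs.count,store.size"})
resp.raise_for_status()
for line in sorted(resp.text.splitlines()):
    print(line)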
When restarting, I did not bother to shut ES down cleanly. As far as I
know, though, that should not cause ES to delete data...
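To rule out shards simply not having come back yet after the unclean restart
(which would make counts look low without anything actually being deleted),
here's a rough sketch of the sanity check I have in mind (same assumed host):

import requests

ES = "http://log-elasticsearch-02:9200"

health = requests.get(ES + "/_cluster/health").json()
print("status:", health["status"],
      "| unassigned:", health["unassigned_shards"],
      "| initializing:", health["initializing_shards"])

# _cat/shards lists every shard with its state; print anything not STARTED.
shards = requests.get(ES + "/_cat/shards/logstash-2015.01.*",
                      params={"h": "index,shard,prirep,state,docs,node"})
for line in shards.text.splitlines():
    if "STARTED" not in line:
        print(line)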
My data is stored on NFS mounts, so I double-checked that they had mounted
correctly. I also went to the trouble of cleanly shutting the nodes down, one
at a time, and making sure the underlying mount-point directories were empty.
A couple weren't, but the data in them was from months ago, not this month.
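And a rough sketch of confirming that all four nodes actually rejoined and
are serving shards off their NFS-backed data paths (host name assumed as
before):

import requests

ES = "http://log-elasticsearch-02:9200"

# _cat/nodes confirms all 4 nodes are back in the cluster;
# _cat/allocation shows how many shards each node is currently holding.
print(requests.get(ES + "/_cat/nodes", params={"v": "true"}).text)
print(requests.get(ES + "/_cat/allocation", params={"v": "true"}).text)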
In my logs for the past couple of days I don't see anything that really pops
out at me, just that my curator-generated snapshots are all failing for one
reason or another. I don't think that has anything to do with the missing
data, but here's an entry from one of my log files anyway (with a quick
repository check sketched after it).
[2015-01-26 03:00:10,145][WARN ][snapshots                ] [log-elasticsearch-04] [[logstash-2014.12.31][3]] [laneprodelk:curator-20150126110002] failed to create snapshot
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: [logstash-2014.12.31][3] Failed to perform snapshot (index files)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:503)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.snapshot(BlobStoreIndexShardRepository.java:139)
    at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:86)
    at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:818)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /mounts/prod_backup/laneprodelk/indices/logstash-2014.12.31/3/__0 (No such file or directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
    at org.elasticsearch.common.blobstore.fs.FsBlobContainer.createOutput(FsBlobContainer.java:85)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshotFile(BlobStoreIndexShardRepository.java:555)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:501)
    ... 6 more
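Since the FileNotFoundException points at the backup mount, here's a rough
sketch of what I can run to verify the snapshot repository from every node
and list the state of the curator snapshots (repository name taken from the
log entry above; the _verify API is there in 1.4 as far as I know, and the
host name is assumed as before):

import requests

ES = "http://log-elasticsearch-02:9200"
REPO = "laneprodelk"

# _verify makes each node write a test file into the repository path, which
# should flush out any node whose /mounts/prod_backup mount is broken.
verify = requests.post(ES + "/_snapshot/" + REPO + "/_verify")
print("verify:", verify.status_code, verify.text)

# List all snapshots in the repository with their state and shard failures.
snaps = requests.get(ES + "/_snapshot/" + REPO + "/_all").json()
for s in snaps.get("snapshots", []):
    print(s["snapshot"], s["state"], len(s.get("failures", [])), "shard failures")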
ES version info:
{
  "status" : 200,
  "name" : "log-elasticsearch-02",
  "cluster_name" : "logstash-webservices",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
I'm running the nodes on Ubuntu 12.04, 64-bit.
I do have curator deleting indices older than 365 days every night. It also
closes indices older than 90 days and optimizes indices older than 2 days.
Curator is version 2.0.1.
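In case it matters, a rough sketch of listing each index's open/closed state
from the cluster metadata, to rule out curator having closed or deleted
anything newer than expected (host name assumed, as before):

import requests

ES = "http://log-elasticsearch-02:9200"

state = requests.get(ES + "/_cluster/state/metadata").json()
for name, meta in sorted(state["metadata"]["indices"].items()):
    print(name, meta["state"])  # "open" or "close"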
So, can you help me figure this out?
What could cause data loss?
What should I look for in the logs?
Which logs should I look at? Just elasticsearch.log and indexname.log.date?
Thanks!