I read through the past couple of months' worth of "missing data" topics on
this list and didn't really find anything that helps, so apologies if I
missed something obvious in them.
Yesterday evening I ran OS updates to fix the glibc vulnerability, which
meant restarting my 4 ES nodes. Today I checked ElasticHQ, and a bunch of
this month's logstash indices are missing tons of data. One day went from
over a million logs to 53.
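For reference, a rough sketch of double-checking those counts straight from
the _cat API instead of through ElasticHQ (the host name is just one of my
node names with the default port, adjust as needed; the index pattern matches
my daily logstash indices for this month):

import requests

ES = "http://log-elasticsearch-02:9200"

# One line per index; h= picks the columns to return.
resp = requests.get(ES + "/_cat/indices/logstash-2015.01.*",
                    params={"h": "index,pri,rep,docs.count,store.size"})
resp.raise_for_status()
for line in sorted(resp.text.splitlines()):
    print(line)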
When restarting, I did not bother to shut ES down cleanly. As far as I
know, though, that should not cause ES to delete data...
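To rule out shards simply not having come back yet after the unclean restart
(which would make counts look low without anything actually being deleted),
here's a rough sketch of the sanity check I have in mind (same assumed host):

import requests

ES = "http://log-elasticsearch-02:9200"

health = requests.get(ES + "/_cluster/health").json()
print("status:", health["status"],
      "| unassigned:", health["unassigned_shards"],
      "| initializing:", health["initializing_shards"])

# _cat/shards lists every shard with its state; print anything not STARTED.
shards = requests.get(ES + "/_cat/shards/logstash-2015.01.*",
                      params={"h": "index,shard,prirep,state,docs,node"})
for line in shards.text.splitlines():
    if "STARTED" not in line:
        print(line)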
My data is stored on NFS mounts, so I double-checked that they had mounted
correctly. I also went to the trouble of cleanly shutting the nodes down, one
at a time, and making sure the underlying mount-point directories were empty.
A couple weren't, but the data in them was from months ago, not this month.
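And a rough sketch of confirming that all four nodes actually rejoined and
are serving shards off their NFS-backed data paths (host name assumed as
before):

import requests

ES = "http://log-elasticsearch-02:9200"

# _cat/nodes confirms all 4 nodes are back in the cluster;
# _cat/allocation shows how many shards each node is currently holding.
print(requests.get(ES + "/_cat/nodes", params={"v": "true"}).text)
print(requests.get(ES + "/_cat/allocation", params={"v": "true"}).text)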
In my logs for the past couple of days I don't see anything that really pops
out at me, just that my curator-generated snapshots are all failing for one
reason or another. I don't think that has anything to do with the missing
data, but here's an entry from one of my log files anyway (with a quick
repository check sketched after it).
[2015-01-26 03:00:10,145][WARN ][snapshots                ] [log-elasticsearch-04] [[logstash-2014.12.31][3]] [laneprodelk:curator-20150126110002] failed to create snapshot
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: [logstash-2014.12.31][3] Failed to perform snapshot (index files)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:503)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.snapshot(BlobStoreIndexShardRepository.java:139)
    at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:86)
    at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:818)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /mounts/prod_backup/laneprodelk/indices/logstash-2014.12.31/3/__0 (No such file or directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
    at org.elasticsearch.common.blobstore.fs.FsBlobContainer.createOutput(FsBlobContainer.java:85)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshotFile(BlobStoreIndexShardRepository.java:555)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:501)
    ... 6 more
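Since the FileNotFoundException points at the backup mount, here's a rough
sketch of what I can run to verify the snapshot repository from every node
and list the state of the curator snapshots (repository name taken from the
log entry above; the _verify API is there in 1.4 as far as I know, and the
host name is assumed as before):

import requests

ES = "http://log-elasticsearch-02:9200"
REPO = "laneprodelk"

# _verify makes each node write a test file into the repository path, which
# should flush out any node whose /mounts/prod_backup mount is broken.
verify = requests.post(ES + "/_snapshot/" + REPO + "/_verify")
print("verify:", verify.status_code, verify.text)

# List all snapshots in the repository with their state and shard failures.
snaps = requests.get(ES + "/_snapshot/" + REPO + "/_all").json()
for s in snaps.get("snapshots", []):
    print(s["snapshot"], s["state"], len(s.get("failures", [])), "shard failures")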
ES version info:
{
  "status" : 200,
  "name" : "log-elasticsearch-02",
  "cluster_name" : "logstash-webservices",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
I'm running the nodes on Ubuntu 12.04, 64-bit.
I do have curator deleting indices older than 365 days every night. It also
closes indices older than 90 days and optimizes indices older than 2 days.
Curator is version 2.0.1.
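In case it matters, a rough sketch of listing each index's open/closed state
from the cluster metadata, to rule out curator having closed or deleted
anything newer than expected (host name assumed, as before):

import requests

ES = "http://log-elasticsearch-02:9200"

state = requests.get(ES + "/_cluster/state/metadata").json()
for name, meta in sorted(state["metadata"]["indices"].items()):
    print(name, meta["state"])  # "open" or "close"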
So, can you help me figure this out?
What could cause data loss?
What should I look for in the logs?
Which logs should I look at? Just elasticsearch.log and indexname.log.date?
Thanks!