Strange errors while indexing documents


(Srinath C) #1

Hi,

We've been observing some strange exceptions in our cluster. Some of the instances are throwing exceptions like the one below. We are running a 6-node cluster with ES 1.3.0. Data is pushed to ES via a Storm cluster.

Any clues as to what these exceptions mean and how to overcome them?

[2015-07-03 12:24:19,611][WARN ][index.store ] [metrics-datastore-1-QA2906-perf] [qaautomation3-4-1250537325][1] Can't open file to read checksums
java.io.FileNotFoundException: No such file [_pmi.fdt]
at org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:173)
at org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:144)
at org.elasticsearch.index.store.DistributorDirectory.openInput(DistributorDirectory.java:130)
at org.elasticsearch.index.store.Store$MetadataSnapshot.checksumFromLuceneFile(Store.java:532)
at org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:459)
at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:433)
at org.elasticsearch.index.store.Store.getMetadata(Store.java:144)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:724)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:576)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:183)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:444)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
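
For anyone triaging similar warnings, it can help to see which node, index, shard, and segment file each one refers to, and whether the same shard keeps recurring. The following is only a rough sketch in Python, assuming the log format is exactly as in the excerpt above; the function names `missing_files` and `count_by_shard` are made up for illustration and are not part of Elasticsearch.

```python
import re
from collections import Counter

# Header format assumed from the excerpt above:
# [timestamp][LEVEL ][component ] [node] [index][shard] message
HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<component>[^\]]+?)\s*\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s+(?P<msg>.*)$"
)
# Exception line that names the missing segment file.
MISSING = re.compile(r"FileNotFoundException: No such file \[(?P<file>[^\]]+)\]")


def missing_files(lines):
    """Yield (node, index, shard, file) for each missing-file exception."""
    current = None
    for line in lines:
        if line.startswith("["):
            # A new log entry begins; remember its shard if the header parses.
            m = HEADER.match(line)
            current = (m["node"], m["index"], int(m["shard"])) if m else None
            continue
        m = MISSING.search(line)
        if m and current is not None:
            yield (*current, m["file"])
            current = None


def count_by_shard(lines):
    """Count missing-file exceptions per (index, shard)."""
    return Counter((index, shard) for _, index, shard, _ in missing_files(lines))
```

Feeding the lines of a node's log file to `count_by_shard` shows whether the warnings cluster on a few shards or are spread across the whole index. That is only a starting point for narrowing things down, not a diagnosis.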


(Nik Everett) #2

Looks like something nuked that file or directory out from under the Elasticsearch process.


(Srinath C) #3

These are production systems with very limited access. Any other possible reasons?
It's probably worth mentioning: it's a cluster of 6 c3.xlarge instances with data stored on EBS with 1000 provisioned IOPS.


(Srinath C) #4

This issue has been reproduced repeatedly. Would appreciate any kind of help or pointers.


(Nik Everett) #5

It's beyond me, sorry!


(Srinath C) #6

Anyone from the Elastic team? How can I get some input on this?

