I've been indexing millions of docs in an ES cluster with 4 processes
that launch about 10 threads each, and each one of those threads uses
a Transport client for indexing.
For one day everything was fine, but today we started to experience
lots of slow queries to ES and found the following output in the logs
of almost every server (there are 6 ES servers):
[2012-01-12 00:20:38,510][WARN ][index.merge.scheduler ] [Phage] [items][14] failed to merge
java.io.FileNotFoundException: _b7i6_1.del
    at org.elasticsearch.index.store.Store$StoreDirectory.fileLength(Store.java:378)
    at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:303)
    at org.apache.lucene.index.MergePolicy$OneMerge.totalBytesSize(MergePolicy.java:174)
    at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:79)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:456)
What could be the reasons for this warning?
The cluster health shows everything is OK, so I guess no information has
been lost, but since these are production servers, I'd rather know as
soon as possible whether this can be a problem.
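For anyone wanting to look past the overall status, here is a minimal Python sketch that reads a cluster health response and lists any shard counters that are not zero. It assumes the JSON body was fetched from the `_cluster/health` endpoint of one of the nodes; the function name `summarize_health` and the sample values are made up for illustration. As far as I understand, health mostly reflects shard allocation, so a green status on its own may not rule out a merge failure inside a shard.

```python
import json


def summarize_health(health_json):
    """Return (status, problems) for a cluster health response body.

    `problems` lists the shard counters that are not zero.
    """
    health = json.loads(health_json)
    problems = []
    for key in ("unassigned_shards", "initializing_shards", "relocating_shards"):
        count = health.get(key, 0)
        if count:
            problems.append("%s=%d" % (key, count))
    return health.get("status"), problems


# Hypothetical response body; the numbers are illustrative, not from the cluster above.
sample = (
    '{"status": "green", "active_shards": 60, "unassigned_shards": 0, '
    '"initializing_shards": 0, "relocating_shards": 0}'
)

status, problems = summarize_health(sample)
print(status, problems)
```

An empty problem list together with a green status only says that all shards are allocated; the merge warnings in the logs still need to be checked separately.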
Thanks for your reply, and sorry for the missing info:
I'm using ES 0.18.5
AFAIK nobody should have deleted any file from those servers
In some servers, only the mentioned output has been logged so far
(last night another similar warning was logged, but with a
different file name: java.io.FileNotFoundException: _cqjh_2.del).
In some other servers there are only DEBUG stack traces like the
following:
[2012-01-10 15:08:59,178][DEBUG][action.admin.cluster.node.info] [James Howlett] failed to execute on node [pfh3QO6ATo2A4wcEk2Cq0g]
org.elasticsearch.transport.RemoteTransportException: [Gog][inet[/172.16.138.113:9300]][/cluster/nodes/info/node]
Caused by: java.lang.NullPointerException
    at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:65)
I assumed this was expected, as I've removed a server from the cluster
whose IP was configured in the 'discovery.zen.ping.unicast.hosts'
list.
I'm planning to restart each one of the ES instances because of some
other issues (too many open idle connections). Could that possibly
recreate the missing files?
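To compare which shards and files the merge warnings mention on each server, a small log scan can help. This is only a sketch in Python: it assumes each warning sits on one line in the real log file, followed by the FileNotFoundException line, as in the output pasted above. The names `WARN_RE`, `MISSING_RE` and `missing_files` are made up for this example.

```python
import re
from collections import Counter

# Matches the "failed to merge" warning header and captures node, index and shard.
WARN_RE = re.compile(
    r"\[WARN \]\[index\.merge\.scheduler\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s*failed to merge"
)
# Matches the exception line that names the missing file.
MISSING_RE = re.compile(r"java\.io\.FileNotFoundException:\s*(?P<file>\S+)")


def missing_files(lines):
    """Count (node, index, shard, file) tuples for failed-merge warnings."""
    counts = Counter()
    current = None
    for line in lines:
        warn = WARN_RE.search(line)
        if warn:
            current = (warn.group("node"), warn.group("index"), int(warn.group("shard")))
            continue
        missing = MISSING_RE.search(line)
        if missing and current is not None:
            counts[current + (missing.group("file"),)] += 1
            current = None
    return counts


# Sample lines taken from the warning quoted in this thread.
sample_log = [
    "[2012-01-12 00:20:38,510][WARN ][index.merge.scheduler ] [Phage] [items][14] failed to merge",
    "java.io.FileNotFoundException: _b7i6_1.del",
    "    at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:303)",
]

for key, count in sorted(missing_files(sample_log).items()):
    print(key, count)
```

Running it over the log file of each node (for example with `missing_files(open(path))`, where `path` is wherever the node writes its log) shows whether the same shard keeps failing or the warnings are spread across shards.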
I've 'gisted' the cluster state here: https://gist.github.com/1602222
Were you able to resolve your issue? If yes, how?
I'm facing a similar issue and my queries are getting really slow. Not sure
how to fix it.
Thanks
On Thursday, January 12, 2012 10:33:05 AM UTC-8, Frederic wrote: