MORE INFO:
I grepped the logs for the 'WARN' messages only.
Master node (ES1) logs:
[2014-06-30 09:02:36,942][WARN ][index.engine.internal ] [NES1]
[logsjmeter14][2] failed engine [refresh failed]
[2014-06-30 09:02:37,715][WARN ][cluster.action.shard ] [NES1]
[logsjmeter14][2] sending failed shard for [logsjmeter14][2],
node[dbPhRQoQQE-Tlgict_gfeg], [P], s[STARTED], indexUUID
[lXE8Wre0S3KxjGs9Jov1tw], reason [engine failure, message [refresh
failed][CorruptIndexException[codec header mismatch: actual header=0 vs
expected header=1071082519 (resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_1a_es090_0.blm
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter14/2/index/_1a.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter14/2/index/_1a.cfs")
slice=29488:29662)))]]]
[2014-06-30 09:02:37,717][WARN ][cluster.action.shard ] [NES1]
[logsjmeter14][2] received shard failed for [logsjmeter14][2],
node[dbPhRQoQQE-Tlgict_gfeg], [P], s[STARTED], indexUUID
[lXE8Wre0S3KxjGs9Jov1tw], reason [engine failure, message [refresh
failed][CorruptIndexException[codec header mismatch: actual header=0 vs
expected header=1071082519 (resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_1a_es090_0.blm
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter14/2/index/_1a.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter14/2/index/_1a.cfs")
slice=29488:29662)))]]]
[2014-06-30 09:03:14,809][WARN ][cluster.action.shard ] [NES1]
[logsjmeter87][4] received shard failed for [logsjmeter87][4],
node[XVuxg7fzTT-Xy-ArWiapJQ], [R], s[STARTED], indexUUID
[leU8sfPETKCeQFvntNY9sg], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene45DocValuesData vs expected codec=Lucene45ValuesMetadata
(resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_12_Lucene45_0.dvm
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs")
slice=15224:15300)))]]]
[2014-06-30 09:03:24,021][WARN ][index.engine.internal ] [NES1]
[logsjmeter65][1] failed engine [refresh failed]
[2014-06-30 09:03:24,371][WARN ][cluster.action.shard ] [NES1]
[logsjmeter65][1] sending failed shard for [logsjmeter65][1],
node[dbPhRQoQQE-Tlgict_gfeg], [P], s[STARTED], indexUUID
[WXUHlSGVQ-GPGSKg0oWPIw], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual codec=XBloomFilter vs
expected codec=Lucene41NormsMetadata (resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_1b.nvm in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter65/1/index/_1b.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter65/1/index/_1b.cfs")
slice=15048:15209)))]]]
[2014-06-30 09:03:24,371][WARN ][cluster.action.shard ] [NES1]
[logsjmeter65][1] received shard failed for [logsjmeter65][1],
node[dbPhRQoQQE-Tlgict_gfeg], [P], s[STARTED], indexUUID
[WXUHlSGVQ-GPGSKg0oWPIw], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual codec=XBloomFilter vs
expected codec=Lucene41NormsMetadata (resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_1b.nvm in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter65/1/index/_1b.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter65/1/index/_1b.cfs")
slice=15048:15209)))]]]
[2014-06-30 09:03:31,778][WARN ][index.engine.internal ] [NES1]
[logsjmeter79][0] failed engine [refresh failed]
[2014-06-30 09:03:32,084][WARN ][cluster.action.shard ] [NES1]
[logsjmeter79][0] sending failed shard for [logsjmeter79][0],
node[dbPhRQoQQE-Tlgict_gfeg], [R], s[STARTED], indexUUID
[NZgUPNQnT0Ss0Lhk9PUz1w], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=BLOCK_TREE_TERMS_INDEX vs expected codec=CompoundFileWriterEntries
(resource:
BufferedChecksumIndexInput(MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter79/0/index/_z.cfe")))]]]
[2014-06-30 09:03:32,086][WARN ][cluster.action.shard ] [NES1]
[logsjmeter79][0] received shard failed for [logsjmeter79][0],
node[dbPhRQoQQE-Tlgict_gfeg], [R], s[STARTED], indexUUID
[NZgUPNQnT0Ss0Lhk9PUz1w], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=BLOCK_TREE_TERMS_INDEX vs expected codec=CompoundFileWriterEntries
(resource:
BufferedChecksumIndexInput(MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter79/0/index/_z.cfe")))]]]
[2014-06-30 09:03:33,865][WARN ][monitor.jvm ] [NES1]
[gc][young][228848][7461] duration [1.7s], collections [1]/[2s], total
[1.7s]/[4.4m], memory [3gb]->[2.8gb]/[3.9gb], all_pools {[young]
[168.5mb]->[30.6mb]/[266.2mb]}{[survivor]
[27.8mb]->[29.2mb]/[33.2mb]}{[old] [2.8gb]->[2.8gb]/[3.6gb]}
[2014-06-30 09:03:57,762][WARN ][cluster.action.shard ] [NES1]
[logsjmeter39][1] received shard failed for [logsjmeter39][1],
node[Jjvt3FxwSLWpSCHeIjOedQ], [P], s[STARTED], indexUUID
[_PGn7TPETEWllqz71M2ZBA], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene46FieldInfos vs expected codec=Lucene41PostingsWriterPos
(resource: SlicedIndexInput(SlicedIndexInput(_1j_es090_0.pos in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs")
slice=17707:22401))]]]
ES2 logs:
[2014-06-30 09:03:14,785][WARN ][cluster.action.shard ] [NES2]
[logsjmeter87][4] sending failed shard for [logsjmeter87][4],
node[XVuxg7fzTT-Xy-ArWiapJQ], [R], s[STARTED], indexUUID
[leU8sfPETKCeQFvntNY9sg], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene45DocValuesData vs expected codec=Lucene45ValuesMetadata
(resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_12_Lucene45_0.dvm
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs")
slice=15224:15300)))]]]
ES3 logs:
[2014-06-30 09:03:57,639][WARN ][cluster.action.shard ] [NES3]
[logsjmeter39][1] sending failed shard for [logsjmeter39][1],
node[Jjvt3FxwSLWpSCHeIjOedQ], [P], s[STARTED], indexUUID
[_PGn7TPETEWllqz71M2ZBA], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene46FieldInfos vs expected codec=Lucene41PostingsWriterPos
(resource: SlicedIndexInput(SlicedIndexInput(_1j_es090_0.pos in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs")
slice=17707:22401))]]]
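To see at a glance which shards and segment files are affected, the WARN entries above can be grouped per shard. This is only a rough sketch, not an Elasticsearch tool: it assumes each log record starts with a timestamp and is wrapped across several lines exactly as pasted here, and the names `join_records` and `corrupt_shards` are made up for illustration.

```python
import re
from collections import defaultdict

# A record starts with a timestamp such as "[2014-06-30 09:02:36,942]".
RECORD_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]")
# "[NES1] [logsjmeter14][2] sending failed shard" -> node name, index, shard id.
SHARD = re.compile(
    r"\[(?P<node>\w+)\] \[(?P<index>\w+)\]\[(?P<shard>\d+)\] "
    r"(?:sending failed shard|received shard failed)"
)
# First file path mentioned in the exception text.
PATH = re.compile(r'path="(?P<path>[^"]+)"')


def join_records(lines):
    """Glue wrapped continuation lines back onto the record they belong to."""
    records, current = [], []
    for line in lines:
        line = line.strip()
        if RECORD_START.match(line):
            if current:
                records.append(" ".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        records.append(" ".join(current))
    return records


def corrupt_shards(lines):
    """Map (index, shard) -> set of files named in CorruptIndexException records."""
    found = defaultdict(set)
    for record in join_records(lines):
        if "CorruptIndexException" not in record:
            continue
        shard = SHARD.search(record)
        path = PATH.search(record)
        if shard and path:
            found[(shard["index"], int(shard["shard"]))].add(path["path"])
    return dict(found)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python corrupt_shards.py es1.log
    with open(sys.argv[1]) as handle:
        for (index, shard), files in sorted(corrupt_shards(handle).items()):
            print(index, shard, sorted(files))
```

The "sending" and "received" entries for the same failure collapse into one row, because the result is keyed by index and shard number.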
Thanks and Regards
Sri
On Monday, June 30, 2014 9:07:37 AM UTC-4, sri wrote:
Hi Simon,
I am currently using Elasticsearch 1.2.1, and I am getting the error on all my
data nodes. The errors are below:
[2014-06-30 09:03:57,762][WARN ][cluster.action.shard ] [NES1]
[logsjmeter39][1] received shard failed for [logsjmeter39][1],
node[Jjvt3FxwSLWpSCHeIjOedQ], [P], s[STARTED], indexUUID
[_PGn7TPETEWllqz71M2ZBA], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene46FieldInfos vs expected codec=Lucene41PostingsWriterPos
(resource: SlicedIndexInput(SlicedIndexInput(_1j_es090_0.pos in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs")
slice=17707:22401))]]]
[2014-06-30 09:03:14,785][WARN ][cluster.action.shard ] [NES2]
[logsjmeter87][4] sending failed shard for [logsjmeter87][4],
node[XVuxg7fzTT-Xy-ArWiapJQ], [R], s[STARTED], indexUUID
[leU8sfPETKCeQFvntNY9sg], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene45DocValuesData vs expected codec=Lucene45ValuesMetadata
(resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_12_Lucene45_0.dvm
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter87/4/index/_12.cfs")
slice=15224:15300)))]]]
[2014-06-30 09:03:57,639][WARN ][cluster.action.shard ] [NES3]
[logsjmeter39][1] sending failed shard for [logsjmeter39][1],
node[Jjvt3FxwSLWpSCHeIjOedQ], [P], s[STARTED], indexUUID
[_PGn7TPETEWllqz71M2ZBA], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene46FieldInfos vs expected codec=Lucene41PostingsWriterPos
(resource: SlicedIndexInput(SlicedIndexInput(_1j_es090_0.pos in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter39/1/index/_1j.cfs")
slice=17707:22401))]]]
Thanks and Regards
Sri
On Monday, June 30, 2014 4:00:23 AM UTC-4, simonw wrote:
Hey,
thanks for raising this. Can you give me more information, i.e. which version
you are using and whether this happens on only one shard or on all shards in
your system? It could be just what it says, an index corruption, maybe due to
a hardware failure, but there could be other reasons.
simon
On Friday, June 27, 2014 5:20:26 PM UTC+2, sri wrote:
Hi
I am getting the error below on my ES cluster quite frequently, but I am not
able to understand the actual reason why it is happening.
[2014-06-27 11:12:50,014][WARN ][cluster.action.shard ] [NES1]
[logsjmeter62][0] received shard failed for [logsjmeter62][0],
node[ZqO9OQ8VQ0uGkvXdIeovRg], [P], s[STARTED], indexUUID
[EfBgCRm8SWu4AtsNPYVXyA], reason [engine failure, message [refresh
failed][CorruptIndexException[codec mismatch: actual
codec=Lucene41PostingsWriterDoc vs expected codec=Lucene46FieldInfos
(resource:
BufferedChecksumIndexInput(SlicedIndexInput(SlicedIndexInput(_39.fnm in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter62/0/index/_39.cfs"))
in
MMapIndexInput(path="/data/es/NESClus/nodes/0/indices/logsjmeter62/0/index/_39.cfs")
slice=7371:8755)))]]]
Thanks and Regards
Sri