Elasticsearch shard unassigned and changed to red

Hi
I am using Logstash to ship my log data to Elasticsearch; the indices are created in the format log-YYYY-mm-dd. It was running fine for 4-5 hours, and then the shards failed and turned red. The log says "failed engine [merge failed]" ... "CorruptIndexException" ... and so on.
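The relevant part of my Logstash pipeline output is roughly like this (host and index prefix here are just for illustration, not my exact config):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "log-%{+YYYY-MM-dd}"
  }
}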

What is the real issue?

Even after the failure I am able to search Elasticsearch for a few more hours, but I can't write any data into it at all. After a few more hours the search also fails.

The log file is attached.

Which version of Elasticsearch are you using? What is the specification of your cluster? What kind of storage are you using?

Version 6 (the latest).
Everything is default; I just reduced the heap in jvm.options to 600 MB. What do you mean by kind of storage?
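That is, the heap lines in jvm.options are set to:

-Xms600m
-Xmx600m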

I am running Elasticsearch inside a Docker container; Logstash also runs inside the same container.
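The container is started with a plain docker run, along these lines (image tag, volume name and settings here are illustrative, not my exact command):

docker run -d --name elasticsearch -p 9200:9200 \
  -e "discovery.type=single-node" \
  -v esdata:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch:6.2.4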

Are you using local SSD or spinning disks, SAN, NFS or some other type of storage?

Actually that is unknown, as Elasticsearch runs on a remote server and the storage details of that host are unknown to me. But how does that affect Elasticsearch?

That is a very small heap for running Elasticsearch, but it should not cause corruption. The reason I am asking about storage is that some types of storage do not work well with Elasticsearch and can cause corruption; there are posts on this forum describing examples of this.
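If you can get shell access on that host, you can check what the data path is sitting on, for example (replace the path with your actual Elasticsearch data path):

df -Th /path/to/elasticsearch/data
mount | grep -i nfs

That will tell you whether it is a local filesystem or something network-mounted like NFS.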

There is already another app that uses Elasticsearch running successfully on the same host, so the storage type is not an issue.

Is the use case that app covers similar to yours? What is the full error message you are seeing?

Yes
I can't search Elasticsearch.

It is red.

The request fails; that is the error.
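You can see exactly which shards are red and why with, for example:

curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_cat/shards?v'
curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty'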

What is in the Elasticsearch logs?

Attachments are not processed in this forum, so you need to upload the file to a third-party service, for example gist.github.com, and add the link to your post.
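Since you are running Elasticsearch in Docker, the official image logs to the console, so you can usually capture the log with something like (the container name here is a placeholder):

docker logs <your-elasticsearch-container> > es.log 2>&1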

[2018-06-21T02:50:22,523][INFO ][o.e.c.r.a.AllocationService] [RjaKnLJ] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[ptp-2018-06-21][2]] ...]).

[2018-06-21T04:29:54,733][WARN ][o.e.i.e.Engine ] [RjaKnLJ] [ptp-2018-06-21][2] failed engine [merge failed]

org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))

at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2111) [elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.4.jar:6.2.4]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

Caused by: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))

at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:502) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:414) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.write(Lucene50CompoundFormat.java:103) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:5010) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4507) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4083) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:99) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:661) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

[2018-06-21T04:29:54,848][WARN ][o.e.i.c.IndicesClusterStateService] [RjaKnLJ] [[ptp-2018-06-21][2]] marking and sending shard failed due to [shard failure, reason [merge failed]]

org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))

at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2111) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.4.jar:6.2.4]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

Caused by: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))

at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:502) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:414) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.write(Lucene50CompoundFormat.java:103) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:5010) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4507) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4083) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:99) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:661) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]

[2018-06-21T04:29:54,852][WARN ][o.e.c.a.s.ShardStateAction] [RjaKnLJ] [ptp-2018-06-21][2] received shard failed for shard id [[ptp-2018-06-21][2]], allocation id [omffE7HcSo25TTYLNbZuTA], primary term [0], message [shard failure, reason [merge failed]], failure [MergeException[org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))]; nested: CorruptIndexException[codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))]; ]

org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=10 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/app/data/elasticsearch/nodes/0/indices/wN2Gi9erR_uh4kLASV4SmA/2/index/_13m_Lucene70_0.dvm")))

at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2111) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) ~[elasticsearch-6.2.4.jar:6.2.4]

at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.4.jar:6.2.4]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

These are the logs
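That "codec footer mismatch (file truncated?)" CorruptIndexException means a Lucene segment file on disk has been truncated or altered outside of Elasticsearch, which is why the shard's engine fails during a merge. Once the underlying cause is dealt with, the simplest way out of the red state for a daily index like this is usually to delete the corrupted index and re-ingest that day's data from Logstash, for example (assuming you can replay the data):

curl -XDELETE 'localhost:9200/ptp-2018-06-21'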

I am still interested to know what the underlying storage is. Is the other use case you mentioned also a write-intensive log analytics use case with similar characteristics?

Yes
Both are similar applications

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.