Shard index gone bad, anyone know how to fix this: java.io.EOFException: read past EOF: NIOFSIndexInput

Dimitry · January 4, 2013, 9:36am

We are running version 0.19.9 on 6 servers with 6 shards. A few
days ago, shard number 2 seems to have gone goofy (it looks like
corruption in the index), causing the following exception to appear
constantly in the server logs:

org.elasticsearch.transport.RemoteTransportException: [cardano][inet[/xx.xxx.xx.xxx:9300]][search/phase/query]
Caused by: org.elasticsearch.search.query.QueryPhaseExecutionException: [theindex][2]: query[filtered(+activityObject.content:"Some query term" +sourceInfo.publisher:Some Name -sourceInfo.dataSource:directPooling)->cache(_type:socialmedia)],from[0],size[1],sort[<custom:"sortDate": org.elasticsearch.index.field.data.longs.LongFieldDataType$1@24f99e97>!]: Query Failed [Failed to execute main query]
    at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:182)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:234)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:497)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:486)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:268)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: read past EOF: NIOFSIndexInput(path="/var/data/elasticsearch/nodes/0/indices/theindex/2/index/_161lvl.tis")
    at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:264)
    at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:40)
    at org.apache.lucene.store.DataInput.readVInt(DataInput.java:107)
    at org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)

What I've tried:

Changed replication to 0
Closed/Opened the index (to force rebalancing)
Restarted the node containing shard 2 with index.shard.check_on_startup: true

This seems to point to something going bad with the Lucene index of this
shard, yet check_on_startup didn't seem to solve the problem. Does anyone
know how to get around this?

Much appreciated.
Dimitry.

--

radu_gheorghe · January 5, 2013, 4:31pm

Hi Dimitry,

Read past EOF? I never got that. But here's my braindump, nevertheless:

If there's a problem with shard 2 on all nodes, I don't know what you can
do to recover the data, other than reindex or restore from backup.

If you get this issue only on one server, then I'd try something like this:

shut down the problematic node
reduce the number of replicas by 1
move /var/data/elasticsearch/nodes/0/indices/theindex/2/ to some backup
location
start the node again
increase the number of replicas back to force replication
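
The two replica changes in the steps above go through the index update settings API (PUT /{index}/_settings). Here is a minimal Python sketch that only builds those requests, so they can be inspected before anything is sent. The host, the helper name replica_settings_request, and the replica counts (2, lowered to 1, then back to 2) are assumptions for illustration; the thread does not say how many replicas the index had.

```python
import json
import urllib.request


def replica_settings_request(base_url, index, replicas):
    """Build (but do not send) a PUT that sets index.number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical values: adjust the host and replica counts to your cluster.
    base_url = "http://localhost:9200"
    steps = (
        ("after shutting down the node", 1),    # assumed 2 replicas, reduced by 1
        ("after the node is back up", 2),       # restore to force re-replication
    )
    for label, replicas in steps:
        req = replica_settings_request(base_url, "theindex", replicas)
        print(label, req.get_method(), req.full_url, req.data.decode("utf-8"))
        # To actually apply the change: urllib.request.urlopen(req)
```

Moving the shard directory and stopping or starting the node still have to be done by hand on the affected server; the sketch covers only the settings calls.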

I would assume that if this doesn't fix it, it's either:

a problem with the hardware on that node. You can check the memory and
hard disk, or try to reproduce on another machine to confirm/deny
a problem with shard 2 on that index across all the nodes, in which case
you'd be back to reindexing/restoring from backup

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--

Dimitry · January 29, 2013, 11:46am

Running

java -cp lucene-core-3.6.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /home/es/data/production/nodes/0/indices/users/1/index/ -fix

did the trick, as reported by Marcin Dojwa in this post:
https://groups.google.com/forum/?fromgroups=#!topic/elasticsearch/xprRlA8RQ90
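
A word of caution for anyone repeating this: in Lucene 3.x, CheckIndex with -fix drops every segment it cannot read, so the documents in those segments are lost. It is safer to stop the node, copy the shard directory somewhere, and run the tool once without -fix to see what it reports. Below is a small Python sketch that builds both command lines; the helper name check_index_command is made up for illustration, and the jar and path are the ones from the command above, to be replaced with your own. Note that the trailing dots in -ea:org.apache.lucene... are literal JVM syntax (enable assertions for the package and its subpackages), not an omission.

```python
import shlex


def check_index_command(lucene_jar, index_dir, fix=False):
    """Return the argv list for Lucene 3.x CheckIndex; -fix is opt-in."""
    argv = [
        "java",
        "-cp", lucene_jar,
        "-ea:org.apache.lucene...",  # literal: assertions on for the package tree
        "org.apache.lucene.index.CheckIndex",
        index_dir,
    ]
    if fix:
        # Destructive: unreadable segments are removed from the index.
        argv.append("-fix")
    return argv


if __name__ == "__main__":
    # Values copied from the command in this post; substitute your own.
    jar = "lucene-core-3.6.1.jar"
    index_dir = "/home/es/data/production/nodes/0/indices/users/1/index/"
    for fix in (False, True):
        argv = check_index_command(jar, index_dir, fix=fix)
        print(" ".join(shlex.quote(arg) for arg in argv))
        # To run it for real: subprocess.run(argv, check=True)
```

The first printed line is the read-only check, the second is the repair run.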

Thanks to all.

