EOFException while reading SequenceFile to restore ES

Hi,

I've been trying to back up an ES cluster and restore it using Hadoop and the ES-Hadoop library. I'm using ES 1.7.1.

Because of the size of the data and the network speed, I bucketed the operation by date ranges. For some buckets, everything works fine.
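For context, the restore side is basically the stock ES-Hadoop MapReduce setup. Here is a minimal sketch of the driver, not my actual job; the node address, index name and class names are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.elasticsearch.hadoop.mr.EsOutputFormat;

public class RestoreBucket {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.map.speculative", false); // ES-Hadoop recommends disabling speculative execution
        conf.set("es.nodes", "es-node:9200");                // placeholder
        conf.set("es.resource", "my-index/my-type");         // placeholder index/type

        Job job = Job.getInstance(conf, "es-restore-bucket");
        job.setJarByClass(RestoreBucket.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0])); // one date bucket, e.g. .../2014/7
        job.setMapperClass(Mapper.class);                     // identity mapper: pass Text/MapWritable through
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(MapWritable.class);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(EsOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```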

But for some buckets, I can't restore the data into ES. I get this error when reading the sequence files:

```
2015-12-30 11:43:31,202 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://.../2014/7/part-m-00011:0+134217728
2015-12-30 11:43:35,435 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 300417020(1201668080)
2015-12-30 11:43:35,435 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 1146
2015-12-30 11:43:35,435 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 841167680
2015-12-30 11:43:35,435 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 1201668096
2015-12-30 11:43:35,435 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 300417020; length = 75104256
2015-12-30 11:43:35,448 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-12-30 11:43:35,904 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2015-12-30 11:43:35,905 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
2015-12-30 11:43:35,905 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 106017; bufvoid = 1201668096
2015-12-30 11:43:35,905 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 300417020(1201668080); kvend = 300417016(1201668064); length = 5/75104256
2015-12-30 11:43:35,912 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0
2015-12-30 11:43:35,921 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
at org.apache.hadoop.io.Text.readFields(Text.java:291)
at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:96)
at org.elasticsearch.hadoop.mr.WritableArrayWritable.readFields(WritableArrayWritable.java:54)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:188)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:2247)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2220)
at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:78)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2015-12-30 11:43:35,925 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
```

The odd thing is that it always occurs on the same files. I suspect the files are somehow corrupted, but I don't know how to repair them (the best solution, of course) or bypass them.

Does anyone have a clue on this?

BR,
Aurelien

I've seen this recently with a sequence file that was exactly 128 MB (the same as the block size). Could you check whether that's the case for the specific file you mention?

Hi Kris,

Thank you for the idea, but no! The failing files are all of different sizes, ranging from 20 MB to 230 MB.

At first I thought my entries were somehow too big and that it was failing on the last entry.
But according to the documentation, a mapper reads the sequence file until it finds the end of a record, crossing block boundaries so that each record is read in full.
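To pin down where exactly it breaks, a standalone reader along these lines (just a sketch) should report how many records are readable and at which byte offset the EOFException fires; the elasticsearch-hadoop jar has to be on the classpath since the values reference WritableArrayWritable:

```java
import java.io.EOFException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SeqFileCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // one of the failing part-m-* files
        long records = 0;
        SequenceFile.Reader reader =
                new SequenceFile.Reader(conf, SequenceFile.Reader.file(path));
        try {
            // Instantiate the key/value classes recorded in the file header.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                records++;
            }
            System.out.println("Read " + records + " records cleanly from " + path);
        } catch (EOFException e) {
            System.out.println("EOFException after " + records + " records, at byte "
                    + reader.getPosition() + " of " + path);
        } finally {
            reader.close();
        }
    }
}
```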

Aurelien

Repairing data in HDFS is a bit of black magic since, due to replication (3 by default), corruption should not occur in the first place. My advice is to first back the file up, then check it and see whether something sticks out.
Then do the typical 'reboot': delete it and add it again; this should at least refresh the namenode.
You could also dig into the low-level HDFS infrastructure and check whether all the copies/replicas are identical or not.
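For the low-level check, `hdfs fsck <path> -files -blocks -locations` is the usual starting point; programmatically, something along these lines (just a sketch) lists which datanodes hold each block plus the file checksum HDFS reports:

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockInfo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // the suspect file
        FileSystem fs = path.getFileSystem(conf);

        // Which datanodes hold each block of the file.
        FileStatus status = fs.getFileStatus(path);
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("offset=" + loc.getOffset()
                    + " length=" + loc.getLength()
                    + " hosts=" + Arrays.toString(loc.getHosts()));
        }
        // File-level checksum as reported by HDFS; comparing it against a
        // known-good copy of the same data can confirm corruption.
        System.out.println("checksum=" + fs.getFileChecksum(path));
    }
}
```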

Unfortunately the problem is Hadoop-specific, so you might get more information from your distro forums than here.

Thanks for the ideas. I'm digging into this.
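In case it helps someone else, one way to bypass a truncated tail (accepting that everything after the corruption point is lost) would be to wrap SequenceFileInputFormat so that an EOFException ends the split instead of failing the task. An untested sketch:

```java
import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader;

// Hypothetical wrapper: reads a sequence file split until a truncated
// record is hit, then stops instead of failing the task.
public class LenientSequenceFileInputFormat
        extends SequenceFileInputFormat<Text, MapWritable> {

    @Override
    public RecordReader<Text, MapWritable> createRecordReader(InputSplit split,
                                                              TaskAttemptContext context) {
        final SequenceFileRecordReader<Text, MapWritable> delegate =
                new SequenceFileRecordReader<Text, MapWritable>();

        return new RecordReader<Text, MapWritable>() {
            public void initialize(InputSplit s, TaskAttemptContext c)
                    throws IOException, InterruptedException {
                delegate.initialize(s, c);
            }
            public boolean nextKeyValue() throws IOException, InterruptedException {
                try {
                    return delegate.nextKeyValue();
                } catch (EOFException e) {
                    return false; // truncated record: end the split instead of crashing
                }
            }
            public Text getCurrentKey() throws IOException, InterruptedException {
                return delegate.getCurrentKey();
            }
            public MapWritable getCurrentValue() throws IOException, InterruptedException {
                return delegate.getCurrentValue();
            }
            public float getProgress() throws IOException, InterruptedException {
                return delegate.getProgress();
            }
            public void close() throws IOException {
                delegate.close();
            }
        };
    }
}
```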