We have recently been encountering a "CorruptIndexException" frequently, accompanied by the following stacktrace: "org.apache.lucene.index.CorruptIndexException: compound sub-files must have a valid codec header and footer: file is too small (0 bytes) (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/UAS5VDw1Sv6xrrrvoN39Bw/1/index/_52.kdm")))".
Upon checking the "_52.kdm" file, we found that it actually contains 143 bytes. Has anyone else encountered a similar issue?
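A quick way to cross-check this kind of mismatch is to look at the file directly on disk. The sketch below (assuming a Linux host with GNU coreutils; the path is the one from the stacktrace, so adjust it to your own data directory) compares the on-disk size with what Lucene reported and peeks at the codec header. The magic constant is the one Lucene's CodecUtil writes at the start of every codec file.

```shell
# Path of the suspect segment file, taken from the stacktrace above;
# adjust to your own data directory.
FILE=/usr/share/elasticsearch/data/nodes/0/indices/UAS5VDw1Sv6xrrrvoN39Bw/1/index/_52.kdm

# Size as seen through the mounted filesystem; Lucene saw 0 bytes, so a
# mismatch here points at the storage layer rather than at Elasticsearch.
stat -c '%s bytes' "$FILE"

# Every Lucene codec file starts with a fixed 4-byte magic (3f d7 6c 17)
# followed by the codec name; if the first bytes differ, the file is damaged.
hexdump -C "$FILE" | head -n 4
```

If `stat` on the node disagrees with the size Elasticsearch saw (0 vs 143 bytes here), the filesystem is returning inconsistent views of the same file, which is exactly the kind of misbehaviour the error message is complaining about.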
We are currently using Elasticsearch version 7.17.5 and spring-data-elasticsearch version 4.4.2.
Yeah, GlusterFS isn't a local disk at all, and the error you're seeing indicates it does not behave like a local disk accurately enough for Elasticsearch. See these docs for more information:
Elasticsearch requires the filesystem to act as if it were backed by a local disk, but this means that it will work correctly on properly-configured remote block devices (e.g. a SAN) and remote filesystems (e.g. NFS) as long as the remote storage behaves no differently from local storage.
We created a disk on SAN storage and set up a new VM with the disk. Docker and Docker Compose have been installed on the VM, and the same VM functions as a worker node within a swarm cluster.
On the VM, we created a directory named "/data" and mounted it as a GlusterFS volume to ensure persistent data across the worker nodes.
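For reference, a mount like the one described is typically configured along these lines (a sketch only; the server name "gluster1" and volume name "gv0" are placeholders, not details from this setup):

```
# /etc/fstab entry mounting a GlusterFS volume at /data (placeholder names)
gluster1:/gv0  /data  glusterfs  defaults,_netdev  0  0
```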
I would like to clarify that the Gluster setup is not based on a shared volume or disk. Each worker node has its own local disk, and the GlusterFS directory is mounted on each node to achieve data persistence.
Could you please confirm if this setup appears to be fine or if it might be the cause of the issue?
I can confirm that there is definitely something in your storage setup that is not fine (i.e. doesn't behave like a local disk as ES requires). Since the problem is outside of ES, I can't really help you pin it down further. But I am suspicious of GlusterFS because it has had problems like this before, and GlusterFS 7 in particular has been EOL and unmaintained for years.
As per the docs I linked above:
To narrow down the source of the corruptions, systematically change components in your cluster’s environment until the corruptions stop.
In particular try using a more common filesystem instead of GlusterFS and see if the problems go away.
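When swapping components, it helps to confirm which filesystem actually backs the data path before and after each change, so you know whether the index files really moved off GlusterFS. A minimal sketch, assuming the default data path inside the container:

```shell
# Data path to check; adjust if you relocated path.data.
DATA=/usr/share/elasticsearch/data

# Filesystem type column shows e.g. fuse.glusterfs vs ext4/xfs.
df -T "$DATA"

# Mount source and options for the mount point containing the path.
findmnt -T "$DATA"
```

If `df -T` still reports `fuse.glusterfs` after a change, the data never left the distributed filesystem and the test tells you nothing.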
I know of no issues with Azure persistent storage, but I am also not familiar with its various configuration options and also think we don't run many (any?) tests with it. If you encounter problems, you'll need to contact the Azure folks for help.
It looks like Azure File is distributed storage accessed via SMB or NFS. This type of storage can often result in very poor performance and may not necessarily behave like local storage, which, as David described, is required. I therefore would not rule out that you may experience similar corruption issues with this, but I am also not aware of any reported issues.
I would recommend using premium or standard storage.
Thanks @Christian_Dahlqvist. Even if we go with a Standard/Premium disk, I can't move the container to another worker node in the cluster, since I don't have a shared mount path and I would lose my data when the container moves to another node. So I'm looking for a solution that lets me run the Elasticsearch container on any of the available nodes without losing or corrupting the data.
Looking for some suggestions on this. Kindly help...
Oh okay. When it comes to Kubernetes persistent volumes, the volume may need to be shared with other nodes to keep the pod highly available. Also, when we run more than one replica across multiple nodes, the volume needs to be shared between the nodes; only then is the same data available to all the replicas.
What type of volumes can be used for this requirement?