Difference between df and du in Linux


(Mustafa Sener) #1

Hi,
In one of our production clusters we saw a big difference between the
disk usage reported by df and by du: df shows 100% and du shows 70%. The
gap stays the same while the cluster is running. Do you have any idea
what could cause this?

--
Mustafa Sener
www.ifountain.com


(Joaquin Cuenca Abela) #2

df shows the disk space used on the whole volume; du shows the disk
space used by the files it can access. The remaining 30% is in files
that du is not seeing. Assuming there are no obvious mistakes, this
usually comes from big files that have been deleted from the file
system but are still held open (or mapped) by some process. For
instance, maybe you deleted a big log file but forgot to restart /
reload the server that created it.
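
The effect is easy to reproduce on a small scale. The following is only a sketch, assuming Linux with /proc and GNU coreutils; the function name and the 1 MiB test file are made up for illustration. It creates a file, keeps it open, deletes it, and shows that the name is gone while the data is still held:

```shell
#!/bin/sh
# Reproduce "deleted but still open" on a small scale (Linux only: uses /proc).
demo_deleted_but_open() {
    dir=$(mktemp -d) || return 1

    # Create a 1 MiB file and keep it open on file descriptor 3.
    dd if=/dev/zero of="$dir/big.log" bs=1024 count=1024 2>/dev/null
    exec 3<"$dir/big.log"
    rm "$dir/big.log"

    # The name is gone, so du has nothing left to walk in this directory ...
    echo "entries visible to du: $(ls -A "$dir" | wc -l)"

    # ... but the data stays allocated until the descriptor is closed.
    # /proc/self is the readlink/stat process itself, which inherits fd 3.
    echo "fd 3 points to: $(readlink /proc/self/fd/3)"
    echo "size still held: $(stat -L -c %s /proc/self/fd/3) bytes"

    exec 3<&-       # closing the descriptor is what actually frees the space
    rmdir "$dir"
}

demo_deleted_but_open
```

Until the last descriptor is closed (here by `exec 3<&-`, in real life usually by restarting or reloading the process), df keeps counting that space and du does not.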

--
Joaquin Cuenca Abela -- presspeople.com: Fuentes de prensa y comunicados


(Mustafa Sener) #3

That is exactly why I am reporting this. I am trying to find out whether
it could be caused by ES. We have hit this situation twice in our
production ES cluster. The first time, ES threw an exception about disk
space, and when we checked the file system we saw that df and du reported
different values. We restarted the servers and our data was corrupted as
a result, so we restarted them again and repopulated all the data. Nearly
two months later the same situation appeared again. We use ES version
0.15.2. When I searched for this problem, I saw that other Lucene-based
products hit the same problem if an IndexReader is not closed. Could this
be a problem in ES too?



(Yeroc) #4

You can use a command like "lsof | grep deleted" to get a listing of
the deleted files which have open file handles along with the pid that
is holding them. At least you'll know the names of the files and can
validate that they are indeed being held open by elasticsearch. If
you provide the filenames it may at least help narrow down the cause
of the issue.
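
If lsof is not installed, roughly the same listing can be read straight from /proc. This is a sketch, assuming Linux; the function name is hypothetical, and it needs root to see processes owned by other users. With lsof available, `lsof +L1` (open files whose link count is below 1) is a narrower alternative to grepping for "deleted".

```shell
#!/bin/sh
# List deleted-but-open files by walking /proc (Linux only).
# Output columns: pid, command name, link target ending in " (deleted)".
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Processes can exit, or be unreadable, while we scan: skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                pid=${fd#/proc/}    # strip the leading "/proc/"
                pid=${pid%%/*}      # keep only the pid component
                printf '%s\t%s\t%s\n' \
                    "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)" "$target"
                ;;
        esac
    done
}

list_deleted_open
```

Some entries may not be on-disk files at all (anonymous in-memory files can also show up as deleted), so the paths are worth checking against the data directory of the process you suspect.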



(Shay Banon) #5

Which version are you using? You might be hitting this: https://github.com/elasticsearch/elasticsearch/issues/823, which was fixed in 0.16.


(Mustafa Sener) #6

I am using version 0.15.2



(Sourav Gulati) #7

Disk space held by files that running processes still have open, even after those files have been deleted, is counted by df but not by du. You can run lsof, add up the space used by those open files, and add it to the output of du; you will get the same result as df.
For example, if df shows the file system mounted on /tmp at 100% usage, run lsof | grep /tmp and add up the space used. The df and du results will then match.
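
The adding-up step can be scripted. Below is a rough sketch, assuming Linux with /proc, GNU stat and awk; the function name is hypothetical. It sums the apparent size of every deleted-but-open file under a given directory, counting each file once even if several descriptors or processes hold it:

```shell
#!/bin/sh
# Sum the sizes of deleted-but-open files under a directory (default: /).
# Uses apparent size (%s), so sparse or compressed files are overestimated.
deleted_open_bytes() {
    prefix=${1:-/}
    prefix=${prefix%/}      # "/tmp/" -> "/tmp", "/" -> ""
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$prefix"/*' (deleted)')
                # Print "device:inode size" for the unlinked file.
                stat -L -c '%d:%i %s' "$fd" 2>/dev/null
                ;;
        esac
    done | awk '!seen[$1]++ { total += $2 } END { printf "%.0f\n", total }'
}

deleted_open_bytes /tmp
```

The result is in bytes. Added to what du reports for the same mount point, it should land close to the df figure; with the default of `/` the total may also include in-memory entries, so passing the mount point is the safer choice.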

