While indexing to our QA environment, 2 nodes (EC2 m1.large), each with 2
CPUs and 7.3 GB RAM, I am seeing exceptionally high disk IO. http://cl.ly/image/371H01382h2O
I have about 20 processes indexing simultaneously. The index is 3.3 million
documents and about 8 GB on disk. The refresh interval is set to 5 minutes
while the reindex is running, and the index has 5 shards and 1 replica. I
haven't changed the merge policy.
Cutting the concurrent processes from 20 to 10 definitely lowers the IO from
98% to around 50%.
Any tips on lowering disk IO? Is this normal? The reindex seems to finish
properly and all records are properly indexed, but the 100% disk IO scares me.
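For reference, a refresh interval like the one described above can be set dynamically through the index settings API. This is only a sketch; the index name `my_index` and the host/port are placeholders, not values from this thread:

```shell
# Lengthen the refresh interval to 5 minutes during bulk indexing
# (set it to "-1" to disable refreshes entirely, then restore it
# afterwards). "my_index" is a hypothetical index name.
curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
  "index": {
    "refresh_interval": "5m"
  }
}'
```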
On Monday, March 18, 2013 9:32:02 PM UTC-7, Bruno Miranda wrote:
Besides tackling I/O within ES, you should also use disk monitoring; maybe
there is a faulty disk drive... but don't ask me how this works in an EC2
environment. I assume it's no different from local disk management.
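As a minimal sketch of such monitoring (assuming the sysstat package is installed on the nodes), `iostat` reports per-device utilization so a single saturated or misbehaving disk stands out:

```shell
# Extended per-device statistics every 5 seconds; watch the %util
# column to see how saturated each disk is. Requires sysstat.
iostat -x 5
```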
On Mon, 2013-03-18 at 21:32 -0700, Bruno Miranda wrote:
Look at using merge throttling. Merges can be heavy and use a lot of IO.
With throttling, merges will still happen, but they won't swamp your system.
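A sketch of how store-level merge throttling can be enabled cluster-wide with the settings API; the 20 MB/s limit is an illustrative number to tune for your disks, not a value from this thread:

```shell
# Throttle merge IO cluster-wide to roughly 20 MB/s. Merges still
# run, but their disk writes are rate-limited so indexing and search
# IO aren't starved. Adjust max_bytes_per_sec for your hardware.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "indices.store.throttle.type": "merge",
    "indices.store.throttle.max_bytes_per_sec": "20mb"
  }
}'
```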
I see you have "SPM" bookmarked in your browser. You should look at the graphs
under the "Index Stats" tab -- these: https://apps.sematext.com/spm-reports/mainPage.do#report_anchor_esRefreshFlushMerge
-- to see what's going on with ES/Lucene refreshing, flushing, and merging as
you apply the merge-throttling changes that others have suggested.