While indexing to our QA environment, 2 nodes (EC2 m1.large), each with 2
CPUs and 7.3 GB RAM, I am seeing exceptionally high disk IO. http://cl.ly/image/371H01382h2O
I have about 20 processes indexing simultaneously. The index is 3.3 million
documents and about 8 GB on disk. The refresh interval is set to 5 minutes
while the reindex is running, and the index has 5 shards and 1 replica. I
haven't changed the merge policy.
Cutting the concurrent processes from 20 to 10 definitely lowers the IO from
98% to around 50%.
Any tips on lowering disk IO? Is this normal? The reindex seems to finish
properly and all records are properly indexed, but the 100% disk IO scares me.
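For reference, a refresh interval like the one described above can be set dynamically through the index settings API. This is only a sketch; the index name `my_index` and the host/port are placeholders, not values from this thread:

```shell
# Lengthen the refresh interval to 5 minutes during bulk indexing
# (set it to "-1" to disable refreshes entirely, then restore it
# afterwards). "my_index" is a hypothetical index name.
curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
  "index": {
    "refresh_interval": "5m"
  }
}'
```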
On Monday, March 18, 2013 9:32:02 PM UTC-7, Bruno Miranda wrote:
Besides tackling I/O within ES, you should also use disk monitoring; maybe
there is a faulty disk drive... but don't ask me how this works in an EC2
environment. I assume it's no different from local disk management.
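As a minimal sketch of such monitoring (assuming the sysstat package is installed on the nodes), `iostat` reports per-device utilization so a single saturated or misbehaving disk stands out:

```shell
# Extended per-device statistics every 5 seconds; watch the %util
# column to see how saturated each disk is. Requires sysstat.
iostat -x 5
```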
On Mon, 2013-03-18 at 21:32 -0700, Bruno Miranda wrote:
Look at using merge throttling. Merges can be heavy and use a lot of IO.
With throttling, merges will still happen, but they won't swamp your system.
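A sketch of how store-level merge throttling can be enabled cluster-wide with the settings API; the 20 MB/s limit is an illustrative number to tune for your disks, not a value from this thread:

```shell
# Throttle merge IO cluster-wide to roughly 20 MB/s. Merges still
# run, but their disk writes are rate-limited so indexing and search
# IO aren't starved. Adjust max_bytes_per_sec for your hardware.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "indices.store.throttle.type": "merge",
    "indices.store.throttle.max_bytes_per_sec": "20mb"
  }
}'
```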
I see you have "SPM" bookmarked in your browser. You should look at the graphs
under the "Index Stats" tab -- these: https://apps.sematext.com/spm-reports/mainPage.do#report_anchor_esRefreshFlushMerge
-- to see what's going on with ES/Lucene refreshing, flushing, and merging as
you apply the merge-throttling changes that others have suggested.