CPU load getting high (0.90.3)


(Gregory S) #1

Hi all,

I am trying to find out what could be causing the system load to exceed 6.5 on
a 6-core server. This is not yet critical, but it does not
look great. Before throwing more CPU at the problem, I would like to
troubleshoot and figure out the best solution.
I have posted a hot threads dump and some more info as gists. Please find the links
below. Thank you for helping out.

Elasticsearch JVM stats:
https://gist.github.com/Gster1/9459f2e78893609bf713

Elasticsearch hot_threads dump and system info:
https://gist.github.com/Gster1/23a1be1089a8d1f6fde1

Greg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/57fa2147-1daf-485c-985a-0fb8f2746273%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jason Wee) #2

Hi Greg,

Do you have statistics for I/O?

Jason

(In reply to Gregory S's message of Fri, Jan 3, 2014 at 8:00 AM.)


(Gregory S) #3

Hi Jason,

Here is a gist with I/O statistics:
https://gist.github.com/Gster1/6aa9a689c2325823f315

Thank you

Greg



(Jörg Prante) #4

Are geo queries the only kind of query you execute? How heavy is the query
load?

It seems you are using heavy filters or something else that is CPU intensive.

Jörg



(Gregory S) #5

Hi Jörg,

This is the only query being executed on this cluster. There are about 40
queries/sec and the document count is ~10 million.
Here is the query: https://gist.github.com/Gster1/cd4a511013576ba19621

Thank you

Greg

(In reply to Jörg Prante's message of Thu, Jan 2, 2014 at 11:57 PM.)


(Otis Gospodnetić) #6

Hi Greg,

Is the CPU usage high? Can you share some graphs that show trends? Is the
CPU wait time high, by any chance? What about user and system time? Can you
correlate CPU usage with disk I/O or GC?

You can easily look at this sort of stuff with SPM for ES and send any
graphs you want directly to this list, so we can see them and help.

SPM for ES: http://sematext.com/spm/elasticsearch-performance-monitoring/
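
For anyone without a monitoring tool at hand, the user/system/wait split asked about above can also be estimated by hand. The following is only a minimal sketch, assuming a Linux host where the first line of /proc/stat is the aggregate "cpu" line with counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The function names are made up for illustration and are not part of Elasticsearch or SPM.

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
# Older kernels may report fewer fields; zip() below tolerates that.
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def cpu_breakdown(before, after):
    """Return the percentage of CPU time spent in each state between two samples."""
    deltas = {state: after[state] - before[state] for state in before}
    total = sum(deltas.values())
    if total == 0:
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * ticks / total for state, ticks in deltas.items()}


def sample(path="/proc/stat"):
    """Read one sample of the aggregate CPU counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Calling sample() twice a few seconds apart and passing both results to cpu_breakdown() gives the split for that interval. A high iowait share would point at disk, while a high user share would point at query or GC work inside the JVM.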

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

(In reply to Gregory S's message of Thursday, January 2, 2014 7:00:19 PM UTC-5.)


(Gregory S) #7

Hi Otis,

Here are some interesting trends. Basically, they seem to confirm that we are
not I/O bound. Also, load, CPU, garbage collection, write I/O per second, and
query latency all increase with the query count (see attached graphs).
This is all expected. What concerns me most is that
query response time is starting to slow down significantly (~200 ms) and
the load is going above the number of cores (6) during peak traffic...

Thank you

Greg

(In reply to Otis Gospodnetić's message of Fri, Jan 3, 2014 at 7:04 PM.)


(Otis Gospodnetić) #8

Hi Gregory,

So you have about 140K queries in 1 hour in one of the graphs, and the
average latency is close to 200 ms on a server with 6 cores.
140K queries per hour ==> 140K/60/60 = ~39 QPS
On a server with 6 cores this means 39/6 = 6.5 QPS/core
Each query taking 200 ms on average means 6.5 * 0.200 = 1.3 seconds of query time per core per second

I believe this can be roughly interpreted as "during each second, a core has
to do 1.3 seconds' worth of work", which leads to some waiting on the CPU,
which is why you see that load.
I don't have an explanation for why the CPU is not at 100%. Maybe it is because
of those disk writes, which contribute to the load but end up making the
CPU wait? In that case, I'm not sure why we don't see any wait time on the
CPU graphs, unless you removed that metric from the CPU graph.
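
The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the same arithmetic, using the figures quoted in this thread (140K queries per hour, ~200 ms average latency, 6 cores). It treats the whole 200 ms as time a query occupies a core, which is an assumption and would overestimate CPU demand if part of the latency is spent waiting on disk or network. The function names are hypothetical.

```python
def queries_in_flight(queries_per_hour, avg_latency_s):
    """Average number of queries being processed at any instant.

    This is arrival rate times time in system (Little's law).
    """
    qps = queries_per_hour / 3600.0
    return qps * avg_latency_s


def work_per_core(queries_per_hour, avg_latency_s, cores):
    """Seconds of query time each core would have to absorb per wall-clock second."""
    return queries_in_flight(queries_per_hour, avg_latency_s) / cores


# Figures quoted in the thread.
in_flight = queries_in_flight(140_000, 0.200)
per_core = work_per_core(140_000, 0.200, 6)
print(f"queries in flight: {in_flight:.1f}, work per core: {per_core:.1f} s/s")
```

With these inputs the estimate is about 7.8 queries in flight on average, or about 1.3 per core, which is in the same range as the reported load of 6.5 and above on 6 cores.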

Otis


(In reply to Gregory S's message of Monday, January 6, 2014 8:38:12 PM UTC-5.)

