Elasticsearch high load/CPU usage


(Alpha01) #1

Hello,

We have an Elasticsearch cluster configured with 5 nodes. Every so often
one random node will get extremely high load/CPU usage. We thought the
issue was garbage-collection related because of the frequent Eden space
garbage collection occurrences the nodes were having. We increased the total
heap size of the application to 4 GB and set the old gen survival to 1.5
GB. However, although it is now less frequent, we still see sporadic
high load/CPU on any node that is part of our Elasticsearch cluster.
Does anyone know how we can optimize Java/Elasticsearch further in order
to avoid having one of our nodes hit extremely high load/CPU usage?

set.default.ES_HEAP_SIZE=4096
-XX:NewSize=1536m

Node details:
CentOS 6
ElasticSearch Version: 0.19.1, JVM: 19.1-b02
6 GB RAM
2.53GHz Intel Xeon (4 cores per node)

Thanks,
-Tony

--


(Sisu Alexandru) #2

Maybe it would help if you turned on JMX monitoring in ES and used
VisualVM to check GC and the threads.
To do that, in config/elasticsearch.yml add:

jmx.create_connector: true
jmx.port: 9400-9500
jmx.domain: elasticsearch

And restart ES with:
./bin/elasticsearch -Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=3333

Maybe you'll get more info.

Cheers,

Alex



(Shay Banon) #3

Hi,

First, don't set the new gen heap size to this value, it's too high; let it have its default value. I suggest setting the ES heap to about 3gb. Upgrade to 0.19.8 (at least), and when you get to high CPU usage, issue:

curl localhost:9200/_nodes/hot_threads

and gist the response in this thread.
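The ~3gb figure tracks the common rule of thumb of giving Elasticsearch roughly half of the machine's physical RAM and leaving the rest for the OS file-system cache that Lucene depends on. A minimal sketch of that sizing rule (the function name and the 50% ratio are illustrative guidance, not an official formula):

```python
def suggest_heap_mb(total_ram_mb: int, ratio: float = 0.5) -> int:
    """Rule-of-thumb heap size: about half of physical RAM,
    leaving the remainder for the OS page cache."""
    return int(total_ram_mb * ratio)

# A 6 GB node, as in this thread, would get roughly a 3 GB heap:
print(suggest_heap_mb(6144))  # 3072
```

The value would go into ES_HEAP_SIZE (in MB) in place of the 4096 used above.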



(Alpha01) #4

Hi Shay,

Thanks for your help on this, and for creating ElasticSearch :slight_smile: .

localhost:9200/_nodes/hot_threads does not seem to provide lots of details
about the nodes/cluster. I'm able to do so by using the following URLs
instead within any of our nodes:
curl -XGET 'http://localhost:9200/_status?pretty=true'
curl -XGET 'http://localhost:9200/_cluster/nodes/_local/stats?pretty=true'

The following gist contains the output of when one of our nodes was having
high load/cpu usage affecting the cluster.
https://gist.github.com/3801445

The following gist contains the output of a node with normal ElasticSearch
load, at the same time our other node is having extremely high load/cpu
usage.
https://gist.github.com/3801427

-Tony



(Shay Banon) #5

The hot threads output is the one that I need; the others don't show why / where CPU is spent.



(Crwe) #6

I'm experiencing similar problems. The ES log is full of lines like

[2012-09-29 10:33:53,561][WARN ][monitor.jvm] [Gargouille] [gc][ParNew][29047][10329] duration [6.7s], collections [1]/[6.8s], total [6.7s]/[1.2h], memory [3.6gb]->[3.7gb]/[5.9gb], all_pools {[Code Cache] [6.7mb]->[6.7mb]/[48mb]}{[Par Eden Space] [3.6mb]->[4.8mb]/[66.5mb]}{[Par Survivor Space] [6.3mb]->[7mb]/[8.3mb]}{[CMS Old Gen] [3.6gb]->[3.7gb]/[5.9gb]}{[CMS Perm Gen] [37.3mb]->[37.3mb]/[82mb]}
[2012-09-29 10:34:00,145][WARN ][monitor.jvm] [Gargouille] [gc][ParNew][29048][10330] duration [6.4s], collections [1]/[6.6s], total [6.4s]/[1.2h], memory [3.7gb]->[3.7gb]/[5.9gb], all_pools {[Code Cache] [6.7mb]->[6.7mb]/[48mb]}{[Par Eden Space] [4.8mb]->[4.5mb]/[66.5mb]}{[Par Survivor Space] [7mb]->[7.7mb]/[8.3mb]}{[CMS Old Gen] [3.7gb]->[3.7gb]/[5.9gb]}{[CMS Perm Gen] [37.3mb]->[37.3mb]/[82mb]}
[2012-09-29 10:34:03,826][WARN ][monitor.jvm] [Gargouille] [gc][ParNew][29049][10331] duration [3.6s], collections [1]/[3.6s], total [3.6s]/[1.2h], memory [3.7gb]->[3.8gb]/[5.9gb], all_pools {[Code Cache] [6.7mb]->[6.7mb]/[48mb]}{[Par Eden Space] [4.5mb]->[4.5mb]/[66.5mb]}{[Par Survivor Space] [7.7mb]->[8.3mb]/[8.3mb]}{[CMS Old Gen] [3.7gb]->[3.8gb]/[5.9gb]}{[CMS Perm Gen] [37.3mb]->[37.3mb]/[82mb]}

Otis recommended running jstat, so here are the results (from about
the same period as the ES log above, though not necessarily exactly
the same):

$ sudo jstat -gcutil -t 1478 1000
Timestamp      S0      S1      E       O       P      YGC      YGCT   FGC    FGCT       GCT
  31343.4   77.58   76.91   98.58   53.45   61.68   10320  4493.539    38   3.010  4496.549
  31344.6   77.58   76.91   98.58   53.57   61.68   10320  4493.539    38   3.010  4496.549
  31345.6   77.58   66.52   96.67   53.69   61.68   10321  4502.456    38   3.010  4505.466
  31346.6   77.58   66.52   96.67   53.78   61.68   10321  4502.456    38   3.010  4505.466
  31347.6   77.58   72.28   96.67   53.92   61.68   10321  4502.456    38   3.010  4505.466
  31348.6   77.58   72.28   96.67   54.06   61.68   10321  4502.456    38   3.010  4505.466
  31349.6   77.58   72.28   96.67   54.14   61.68   10321  4502.456    38   3.010  4505.466
  31350.6   77.58   72.28   96.67   54.27   61.68   10321  4502.456    38   3.010  4505.466
  31351.6   77.58   72.28   96.67   54.41   61.68   10321  4502.456    38   3.010  4505.466
  31352.7   77.58   72.28   96.67   54.49   61.68   10321  4502.456    38   3.010  4505.466
  31353.6   63.96   72.28   99.51   54.67   61.68   10322  4510.387    38   3.010  4513.397
  31354.6   63.96   72.28   99.51   54.80   61.68   10322  4510.387    38   3.010  4513.397
  31355.6   63.96   72.28   99.51   54.94   61.68   10322  4510.387    38   3.010  4513.397
  31356.6   63.96   72.28   99.51   55.08   61.68   10322  4510.387    38   3.010  4513.397
  31357.7   63.96   72.28   99.51   55.21   61.68   10322  4510.387    38   3.010  4513.397
  31358.7   85.72   72.28   99.51   55.44   61.68   10322  4510.387    38   3.010  4513.397
  31359.6   85.72   72.28   99.51   55.52   61.68   10322  4510.387    38   3.010  4513.397
  31360.7   91.94   72.28   99.51   55.58   61.68   10322  4510.387    38   3.010  4513.397
  31361.7   91.94   77.34   98.80   55.82   61.68   10323  4518.343    38   3.010  4521.353
  31362.7   91.94   77.34   98.80   55.82   61.68   10323  4518.343    38   3.010  4521.353
  31363.6   91.94   84.77   98.80   56.03   61.68   10323  4518.343    38   3.010  4521.353
  31364.7   91.94   84.77   98.80   56.09   61.68   10323  4518.343    38   3.010  4521.353
  31365.7   91.94   84.77   98.80   56.22   61.68   10323  4518.343    38   3.010  4521.353
  31366.7   91.94   93.89   98.80   56.41   61.68   10323  4518.343    38   3.010  4521.353
  31367.7   91.94  100.00   98.80   56.56   61.68   10323  4518.343    38   3.010  4521.353
  31368.8   91.94  100.00   98.80   56.68   61.68   10323  4518.343    38   3.010  4521.353
  31369.7   81.31  100.00   96.70   56.87   61.68   10324  4525.871    38   3.010  4528.882
  31370.8   81.31  100.00   96.70   57.09   61.68   10324  4525.871    38   3.010  4528.882
  31371.8   81.31  100.00   96.70   57.31   61.68   10324  4525.871    38   3.010  4528.882
  31372.7   81.31  100.00   96.70   57.65   61.68   10324  4525.871    38   3.010  4528.882
  31373.8   81.31  100.00   95.35   58.04   61.68   10325  4530.330    38   3.010  4533.340
  31374.8   81.31  100.00   95.35   58.24   61.68   10325  4530.330    38   3.010  4533.340
  31375.8   81.31  100.00   95.35   58.43   61.68   10325  4530.330    38   3.010  4533.340
  31376.8   81.31  100.00   95.35   58.49   61.68   10325  4530.330    38   3.010  4533.340
  31377.8   81.31  100.00   95.35   58.63   61.68   10325  4530.330    38   3.010  4533.340
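Reading the two YGCT columns against the timestamps shows why the node is pegged: young-generation collection time grows by roughly 36.8s over roughly 34.4s of sampled wall clock, i.e. the JVM is spending essentially all of its time in ParNew pauses (a ratio slightly over 100% is an artifact of jstat's coarse sampling during multi-second pauses). A small sketch of that arithmetic, with the sample values copied from the output above:

```python
def gc_overhead(sample_a, sample_b):
    """Fraction of wall-clock time spent in young-gen GC between two
    jstat -gcutil samples, each given as (timestamp_s, cumulative_ygct_s)."""
    (t0, ygct0), (t1, ygct1) = sample_a, sample_b
    return (ygct1 - ygct0) / (t1 - t0)

# First and last samples from the jstat output above:
first = (31343.4, 4493.539)
last = (31377.8, 4530.330)
print(f"{gc_overhead(first, last):.0%}")  # ~107%: GC pauses dominate
```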

$ curl -XGET 'localhost:9200/_nodes/hot_threads'
{"ok":true,"cluster_name":"elasticsearch","nodes":{}}

This is using ES 0.19.8, debian on 2.6.32-5-amd64, 6GB heap with
mlockall, otherwise default settings.

Cheers.



(Crwe) #7

I upgraded to ES 0.19.9. The CPU problem persists, but hot_threads
outputs more info now:

25.9% (129.5ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#12]'
  10/10 snapshots sharing following 22 elements
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)

23.0% (114.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#13]'
  8/10 snapshots sharing following 23 elements
    org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:68)
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)
  2/10 snapshots sharing following 22 elements
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)

21.4% (106.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#11]'
  5/10 snapshots sharing following 23 elements
    org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:79)
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)
  2/10 snapshots sharing following 23 elements
    org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:68)
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)
  2/10 snapshots sharing following 32 elements
    sun.nio.ch.NativeThread.current(Native Method)
    sun.nio.ch.NativeThreadSet.add(Unknown Source)
    sun.nio.ch.FileChannelImpl.read(Unknown Source)
    org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:162)
    org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:270)
    org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:40)
    org.apache.lucene.store.DataInput.readVInt(DataInput.java:107)
    org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
    org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:153)
    org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:62)
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)
  unique snapshot
    org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:77)
    org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:60)
    org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
    org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
    org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
    org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
    org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
    org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
    org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
    org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
    java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    java.lang.Thread.run(Unknown Source)
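Every hot stack above bottoms out in FieldData.load via TermsStringOrdinalsFacetCollector, i.e. the search threads are (re)building string field data for a terms facet, which loads every value of the faceted field into the field data cache; if that cache is evicted under heap pressure it gets reloaded on the next facet request while GC churns. One mitigation commonly discussed for the 0.19.x era was bounding that cache in elasticsearch.yml; a sketch only, with the exact setting names and values to be verified against the documentation for your version:

```yaml
# elasticsearch.yml -- 0.19.x-era field data cache settings (verify before use)
index.cache.field.type: soft        # let GC reclaim field data under memory pressure
index.cache.field.max_size: 50000   # cap entries per segment-field cache
index.cache.field.expire: 10m       # expire entries not accessed recently
```

The more durable fix is usually structural: facet on fewer or lower-cardinality fields, or give the nodes enough heap that the loaded field data fits without constant eviction.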



(Shay Banon) #8

Hey, please gist those responses, otherwise it's hard to analyze them.

It seems like you don't have enough memory allocated to ES to do the type of faceting/sorting you are trying to do. Either allocate more memory to each ES instance (but make sure not to allocate more than ~1/2 of the machine memory, as a rule of thumb), or increase the number of nodes in the cluster.
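As a rough illustration of the ~1/2-of-RAM rule of thumb above, the heap could be sized from the machine's physical memory at startup. This is a sketch assuming a Linux host with `/proc/meminfo`; `ES_HEAP_SIZE` is the variable used elsewhere in this thread, but the helper itself is illustrative, not an official ES script:

```shell
# Sketch: set ES_HEAP_SIZE to roughly half of physical RAM (Linux only).
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
half_mb=$(( total_kb / 2 / 1024 ))
export ES_HEAP_SIZE="${half_mb}m"
echo "ES_HEAP_SIZE=${ES_HEAP_SIZE}"
```

On a 12GB machine this yields a 6144m heap, matching the ~1/2 guideline.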




(Crwe) #9

Thanks Shay. The log (and bigdesk) suggested ES was using 3.8GB out of
HEAP_SIZE=6GB, with a field cache total of 2.9GB (I guess that's part of
the 3.8GB?). I thought 2GB free would be a healthy margin; are there any
rules of thumb on how much "free" RAM to leave inside the JVM?

Btw the machine has 12GB RAM total, so 6GB for java falls within your
~1/2 recommendation.

Cheers.

On Sep 30, 7:07 pm, Shay Banon kim...@gmail.com wrote:

Hey, please gist those responses, otherwise it's hard to analyze them.

It seems like you don't have enough memory allocated to ES to do the type of faceting/sorting you are trying to do. Either allocate more memory to each ES instance (but make sure not to allocate more than ~1/2 of the machine memory, as a rule of thumb), or, increase the number of nodes in the cluster.
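Putting the figures from this post side by side makes the diagnosis concrete; a back-of-envelope check, with the 6GB/3.8GB/2.9GB numbers taken from above and nothing here being an official ES threshold:

```shell
# Heap 6GB (6144MB), ~3.8GB in use, field cache ~2.9GB (approximate MB values).
heap_mb=6144; used_mb=3891; field_cache_mb=2970
echo "heap used: $(( used_mb * 100 / heap_mb ))%"                 # → heap used: 63%
echo "field cache alone: $(( field_cache_mb * 100 / heap_mb ))%"  # → field cache alone: 48%
```

So nearly half the heap is field data before any query-time allocation happens, which is consistent with the advice to add memory or nodes.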

On Sep 30, 2012, at 7:16 AM, Crwe tester.teste...@gmail.com wrote:

I upgraded to ES 19.9. The CPU problem persists, but hot_threads
outputs more info now:

25.9% (129.5ms out of 500ms) cpu usage by thread
'elasticsearch[Rage][search][T#12]'
10/10 snapshots sharing following 22 elements

org.elasticsearch.index.field.data.strings.StringFieldData.load(StringField Data.java:
90)

org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringF ieldDataType.java:
56)

org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringF ieldDataType.java:
34)

org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)

org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapField DataCache.cache(AbstractConcurrentMapFieldDataCache.java:
129)

org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollec tor.doSetNextReader(TermsStringOrdinalsFacetCollector.java:
128)

org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(Abstrac tFacetCollector.java:
81)

org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector .java:
67)

org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)

org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexS earcher.java:
195)

org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)

org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)

org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)

org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)

org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)

org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java :
286)

org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQue ry(SearchServiceTransportAction.java:
174)

org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction
$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:
148)

org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction
$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source)
java.lang.Thread.run(Unknown Source)

23.0% (114.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#13]'
8/10 snapshots sharing following 23 elements
org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:68)
org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)
2/10 snapshots sharing following 22 elements
org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)

21.4% (106.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#11]'
5/10 snapshots sharing following 23 elements
org.elasticsearch.index.field.data.support.FieldDataLoader.load(FieldDataLoader.java:79)
org.elasticsearch.index.field.data.strings.StringFieldData.load(StringFieldData.java:90)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:56)
org.elasticsearch.index.field.data.strings.StringFieldDataType.load(StringFieldDataType.java:34)
org.elasticsearch.index.field.data.FieldData.load(FieldData.java:111)
org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:129)
org.elasticsearch.search.facet.terms.strings.TermsStringOrdinalsFacetCollector.doSetNextReader(TermsStringOrdinalsFacetCollector.java:128)
org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:286)
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:174)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeQuery(TransportSearchDfsQueryThenFetchAction.java:148)
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$2.run(TransportSearchDfsQueryThenFetchAction.java:135)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

...
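As an aside, when a dump like the one above runs to hundreds of lines, the headline percentages are easy to pull out with a few lines of script. A minimal sketch (the regex and the sample lines are illustrative, not part of any Elasticsearch tooling):

```python
import re

# Sample lines in the same shape as the "cpu usage by thread" headers above.
sample = """\
23.0% (114.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#13]'
21.4% (106.9ms out of 500ms) cpu usage by thread 'elasticsearch[Rage][search][T#11]'
"""

# Capture the leading percentage and the quoted thread name.
pattern = re.compile(r"([\d.]+)% .* cpu usage by thread '([^']+)'")

def cpu_by_thread(text):
    """Return {thread_name: cpu_percent} for each header line in the dump."""
    return {m.group(2): float(m.group(1)) for m in pattern.finditer(text)}

print(cpu_by_thread(sample))
```

Sorting the resulting dict by value makes the busiest threads obvious at a glance across a multi-node dump.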


--


(Artem Grinblat) #10

On Monday, October 1, 2012 at 0:46:56 UTC+4, Crwe wrote:

Thanks Shay. The log (and bigdesk) suggested ES was using 3.8GB out of
HEAP_SIZE=6GB, with field cache total of 2.9GB (I guess that's out of
the 3.8GB?). I thought 2GB free would be a healthy margin; any rules
of thumb on how much "free" RAM to leave inside the JVM?

Have you seen this:
http://blog.sematext.com/2012/05/17/elasticsearch-cache-usage/ ?
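That post covers bounding the field cache, which is usually the answer when it eats most of the heap. For reference, the 0.19-era knobs go in elasticsearch.yml and look roughly like this (setting names and values are from that era's docs and are examples only; verify them against your version before relying on them):

```yaml
# Hypothetical elasticsearch.yml fragment: bound the field cache so that
# faceting on high-cardinality string fields cannot fill the whole heap.
index.cache.field.type: soft        # soft references, so GC can reclaim entries under pressure
index.cache.field.max_size: 50000   # cap on cached entries (tune to your data)
index.cache.field.expire: 10m       # evict entries idle for 10 minutes
```

The trade-off is recomputing evicted field data on the next facet request, so a too-aggressive cap shifts the cost from memory to CPU.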

--


(system) #11