Elasticsearch high CPU usage (100%)

I have a problem with Elasticsearch: it consumes 100% of the CPU when I run
a search with faceting against a large index, around 5 GB in size with more
than 1 million docs.

Attached is a hot threads dump taken when the CPU reaches 100%. The request
is also slow: it takes more than 5 seconds before I get a response from
Elasticsearch.

My main settings are:

ES_HEAP_SIZE=5024
ES_MIN_MEM=5024
ES_MAX_MEM=5024

index.number_of_shards: 20

ES version 0.9.2, installed on an Amazon m1.large instance, on an EBS
volume of 150 GB.

Please help me get past this problem.

Thanks,

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I notice you use the JDBC river. For best performance while faceting, it
may help to disable the JDBC river.

Thanks,

Jörg

Thanks for your help, Jörg. I actually use the JDBC river for indexing from
a MySQL database. How can I disable it when faceting?

Thanks

Just remove the river before faceting:

curl -XDELETE 'http://localhost:9200/_river/my_jdbc_river_name/'

Jörg

Does it remove the whole index?

No, it stops river activity and removes the river metadata from ES.

Jörg

Hey,

It seems you are using scripts that fetch stored fields. Is it possible
that you are using _source in a script, or something like that? Can you
provide your query, or one of the queries that uses scripts?

Using _source in a script will almost guarantee that your machine spins at
100% CPU all the time :)

simon

Jörg, I've deleted it and nothing changed: requests still take more than
5 seconds, with the CPU at 100%.
Attached is the hot threads log taken after deleting the JDBC river.

Hi Simonw,

Yep, I use _source for terms faceting, since I need to group by the whole
terms in the field. Below is the query I use:

{
  "query": {
    "bool": {
      "must": [[
        {"range": {"posts.record_insert_date": {"gte": "2008-01-01"}}}
      ]]
    }
  },
  "facets": {
    "posts|record_insert_date": {
      "date_histogram": {"interval": "month", "field": "posts.record_insert_date"}
    },
    "categories_new|name": {
      "terms": {"script_field": "_source.posts.categories_new.name", "all_terms": true}
    },
    "countries|name": {
      "terms": {"script_field": "_source.posts.countries.name", "all_terms": true}
    }
  }
}

Hey, why don't you just use

{ "terms" : { "field" : "posts.countries.name", "all_terms" : true} }

instead?
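
Applied to the whole query from this thread, the change could look roughly like the sketch below. Python is used here only to build and print the request body. The `must` clause is written as a single list, on the assumption that the doubled brackets in the posted query were unintentional. The idea is that a plain field facet reads the indexed terms, whereas a script on _source has to load and parse the stored document for every hit.

```python
import json

# Same search as the one posted in the thread, but the two terms facets read
# the indexed field directly instead of running a script against _source.
query = {
    "query": {
        "bool": {
            # Assumption: a single list; the posted query had [[ ... ]].
            "must": [
                {"range": {"posts.record_insert_date": {"gte": "2008-01-01"}}}
            ]
        }
    },
    "facets": {
        "posts|record_insert_date": {
            "date_histogram": {
                "interval": "month",
                "field": "posts.record_insert_date",
            }
        },
        "categories_new|name": {
            "terms": {"field": "posts.categories_new.name", "all_terms": True}
        },
        "countries|name": {
            "terms": {"field": "posts.countries.name", "all_terms": True}
        },
    },
}

# Serialized request body for the index's _search endpoint.
body = json.dumps(query, indent=2)
print(body)
```

Whether this returns the same buckets depends on the mapping, which the thread does not show: on an analyzed string field, a terms facet counts individual tokens rather than whole values.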

simon
