Degrading performance, weird 100%CPU

AlexeyV · August 31, 2013, 10:18pm

Hi there, I've got a 3-node cluster with one index (5 shards + 1 replica), which is
1GB in size.
The index holds about 2.5 million geo documents. The documents contain
geo_point data, which I query using a geo_bbox filter with a specific
bounding box. For a given test, I have a query that returns 0 rows,
takes 40ms to complete, and causes barely noticeable CPU usage. When I crank
up the service calls (50 concurrent threads), all 3 nodes start
hitting 100% CPU, and the query degrades to 200ms, 500ms+, and up to a few minutes.
The test hits an application service that uses the Java SDK
to issue the query through a TransportClient configured to use all 3
nodes.

My question is: why am I getting degraded performance for the same query?
I would assume that, if anything, it should just pull the result out of cache (note that the
result returned by the query is intentionally zero documents). I suspect there is
another factor overloading my ES cluster, perhaps something outside of
the query that I am missing?

Any advice is greatly appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

dadoonet · September 1, 2013, 5:26am

Here's an extract of the doc: http://www.elasticsearch.org/guide/reference/query-dsl/geo-bounding-box-filter/

The result of the filter is not cached by default. The _cache can be set to true to cache the result of the filter. This is handy when the same bounding box parameters are used on several (many) other queries. Note, the process of caching the first execution is higher when caching (since it needs to satisfy different queries).
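
As a rough sketch of what the quoted option looks like in a request body: the snippet below builds a filtered query with a geo_bounding_box filter and `_cache` switched on. It assumes the filtered-query syntax of the Elasticsearch version current at the time; the field name `location`, the helper `bbox_query`, and the coordinates are hypothetical placeholders, not taken from the original index.

```python
import json


def bbox_query(field, top_left, bottom_right, cache=True):
    """Build a search body with a geo_bounding_box filter.

    `field` is a hypothetical geo_point field name. `top_left` and
    `bottom_right` are (lat, lon) tuples. `_cache` is the option
    described in the quoted documentation.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top_left[0], "lon": top_left[1]},
                            "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
                        },
                        # Ask Elasticsearch to cache the result of this filter.
                        "_cache": cache,
                    }
                },
            }
        }
    }


# Placeholder coordinates, for illustration only.
body = bbox_query("location", (40.73, -74.1), (40.01, -71.12))
print(json.dumps(body, indent=2))
```

The cached result is only reused when a later query sends the same bounding box parameters, so this mainly pays off for a small set of repeated boxes.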

Does it help?

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

AlexeyV · September 3, 2013, 10:10pm

Thanks David! I will try this out and post results.

AlexeyV · September 4, 2013, 11:31pm

So I've done more testing, and it seems caching didn't help. Even
if it had, I doubt it would be a solution, given how random the
bounding boxes are, plus the extra cost on the first query.
I ended up spinning up a beefier server with more CPU capacity to address the
performance degradation. I will continue looking for ways to reduce CPU use.
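
To illustrate why random bounding boxes defeat this kind of caching, here is a toy simulation. It is not a measurement of Elasticsearch: it assumes a cache keyed on the exact box parameters, and the function `hit_rate` and the numbers are made up for the sketch. A query counts as a hit only if an identical box was requested before.

```python
import random


def hit_rate(num_queries, distinct_boxes, seed=0):
    """Fraction of queries whose exact bounding box was already seen.

    Each query picks one of `distinct_boxes` possible boxes uniformly at
    random; an integer stands in for one exact set of box coordinates.
    """
    rng = random.Random(seed)
    seen = set()
    hits = 0
    for _ in range(num_queries):
        box = rng.randrange(distinct_boxes)
        if box in seen:
            hits += 1  # identical box requested earlier: cache hit
        else:
            seen.add(box)  # first execution: pays the caching cost
    return hits / num_queries


print(hit_rate(10_000, 1))           # the same box on every query
print(hit_rate(10_000, 10_000_000))  # effectively random boxes
```

With a single repeated box, every query after the first is a hit. With effectively random boxes almost none are, and each miss still pays the extra first-execution cost.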
