Degrading performance, weird 100% CPU


(AlexeyV) #1

Hi there, I've got a 3-node cluster with one index (5 shards + 1 replica),
which is 1 GB in size.
The index holds about 2.5 million geo docs. Documents contain
geo_point data, which I query using a geo_bbox filter with a specific
bounding box. For a given test, I have a query that returns 0 rows,
takes 40 ms to complete, and causes barely noticeable CPU usage. When I crank
up the service calls (50 concurrent threads), all 3 nodes start
hitting 100% CPU, and query time degrades to 200 ms, 500 ms+, and up to a
few minutes. The test hits an application service that uses the Java SDK
to issue the query through a TransportClient, which is configured to use
all 3 nodes.

My question is: why am I getting degraded performance for the same query?
I would assume that, if anything, it should just be pulled out of the cache
(note that the result returned by the query is intentionally zero
documents). I suspect there's another factor overloading my ES cluster,
perhaps something outside of the query that I am missing?

Any advice is greatly appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

Here's an extract from the docs: http://www.elasticsearch.org/guide/reference/query-dsl/geo-bounding-box-filter/

The result of the filter is not cached by default. The _cache can be set to true to cache the result of the filter. This is handy when the same bounding box parameters are used on several (many) other queries. Note, the process of caching the first execution is higher when caching (since it needs to satisfy different queries).

Does it help?
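
As a rough illustration of the `_cache` flag described in the extract above, here is a minimal Python sketch that only builds the JSON search body. It assumes the `filtered` query syntax used by Elasticsearch releases of that era (both `filtered` and `_cache` were dropped in later versions). The helper name `geo_bbox_search_body`, the field name `location`, and the coordinates are hypothetical, not taken from the original index.

```python
import json


def geo_bbox_search_body(field, top_left, bottom_right, cache=True):
    """Build a search body with a geo_bounding_box filter.

    `top_left` and `bottom_right` are (lat, lon) tuples.
    `cache` maps to the filter's `_cache` option from the docs extract.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_bounding_box": {
                        # Hypothetical geo_point field name supplied by the caller.
                        field: {
                            "top_left": {"lat": top_left[0], "lon": top_left[1]},
                            "bottom_right": {
                                "lat": bottom_right[0],
                                "lon": bottom_right[1],
                            },
                        },
                        # Ask Elasticsearch to cache this filter's result.
                        "_cache": cache,
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = geo_bbox_search_body("location", (40.73, -74.1), (40.01, -71.12))
    print(json.dumps(body, indent=2))
```

The cached result can only be reused when a later query sends exactly the same bounding box, so the flag pays off for repeated boxes rather than for boxes that change on every request.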

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(AlexeyV) #3

Thanks David! I will try this out and post the results.



(AlexeyV) #4

So I've done more testing, and it seems caching didn't help. Even
if it had, I doubt it would be a solution, given the randomness of the
bounding boxes plus the extra hit on the first query.
I ended up spinning up a beefier server with more CPU capacity to address the
performance degradation. I will continue looking for ways to reduce CPU use.
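
For intuition on why extra CPU capacity helps here, a back-of-envelope sketch using Little's law (latency = concurrency / throughput) is shown below. It is illustrative only: it assumes the 40 ms single-query time is pure CPU work, and the core counts are made up, since the thread never states the hardware.

```python
def saturated_latency_ms(concurrency, cpu_ms_per_query, cores):
    """Estimate mean latency when `concurrency` clients loop on a CPU-bound query.

    Assumes each query needs `cpu_ms_per_query` of CPU time on one core and
    that the cluster has `cores` cores in total (both are assumptions).
    """
    # Highest throughput the CPUs can sustain, in queries per second.
    max_qps = cores * 1000.0 / cpu_ms_per_query
    # Little's law: mean latency = clients in flight / throughput.
    queued_latency_ms = concurrency / max_qps * 1000.0
    # Latency can never drop below the unloaded service time.
    return max(float(cpu_ms_per_query), queued_latency_ms)


if __name__ == "__main__":
    for cores in (4, 16):
        print(cores, "cores:", saturated_latency_ms(50, 40, cores), "ms")
```

Under these assumed numbers, 50 concurrent clients on 4 cores would see roughly 500 ms per query, while 16 cores would bring that down to about 125 ms. A single client stays at the unloaded 40 ms either way.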


