Degrading performance, weird 100% CPU


(AlexeyV) #1

Hi there, I've got a 3-node cluster with one index (5 shards + 1 replica), which is 1 GB in size.
The index holds about 2.5 million geo documents. Each document contains geo_point data, which I query with a geo_bbox filter and a specific bounding box. In my test, the query returns 0 rows, takes 40 ms to complete, and causes barely noticeable CPU usage. When I crank up the service calls (50 concurrent threads), all 3 nodes hit 100% CPU and the query degrades to 200 ms, then 500 ms+, and eventually up to a few minutes. The test hits an application service that uses the Java SDK to issue the query through a TransportClient configured to use all 3 nodes.
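
For anyone who wants to reproduce the load pattern, here is a minimal sketch of it, not the actual test code. It assumes the search call can be wrapped in a `Runnable`; the class name `ConcurrentQueryTest`, the method `measure`, and the 40 ms sleep standing in for the real geo_bbox search are all hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ConcurrentQueryTest {

    // Runs `query` once on each of `threads` workers at the same time
    // and returns the latency of each call in milliseconds.
    static List<Long> measure(int threads, Runnable query) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    query.run();
                    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                });
            }
            List<Long> latencies = new ArrayList<>();
            // invokeAll blocks until every task has finished
            for (Future<Long> f : pool.invokeAll(tasks)) {
                latencies.add(f.get());
            }
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder (assumption): swap in the real geo_bbox search issued via the client.
        Runnable query = () -> {
            try {
                Thread.sleep(40);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Long> latencies = measure(50, query);
        long max = 0;
        for (long ms : latencies) {
            max = Math.max(max, ms);
        }
        System.out.println("calls=" + latencies.size() + " maxLatencyMs=" + max);
    }
}
```

Comparing the latencies from `measure(1, query)` and `measure(50, query)` shows whether the slowdown comes from concurrency alone rather than from the query itself.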

My question is: why does performance degrade for the same query? I would have assumed that, if anything, the result would simply be served from cache (note that the query intentionally returns zero documents). I suspect there is another factor overloading my ES cluster, perhaps something outside the query that I am missing?

Any advice is greatly appreciated.

