I am using Elasticsearch 0.90.5 within an Akka application, and I make the Elasticsearch calls through the Elasticsearch Java client. I've been using this setup for a while with no problems... until now.
As part of a new feature, a new actor (think of it as a new thread, but much cheaper) is spawned for each request made to the application. The actor in turn queries Elasticsearch for some data. The result set for this query is large: more than 1900 documents are returned. The same Elasticsearch TransportClient is used across all spawned actors, i.e. the client is a singleton across the application.
On a single request, the Elasticsearch client takes around 500ms to retrieve the results and produce the SearchResponse object. However, when the application is put under minimal load (3 requests per second for 30 seconds), the time taken by the client to retrieve the results increases substantially: each request takes 5000-6000ms. As more load is introduced, the app dies very quickly because the Elasticsearch client becomes a bottleneck.
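For anyone wanting to reproduce the measurement, the load pattern above can be approximated with a small harness like the one below. This is only a sketch using the standard library, not the code from my application: `LoadProbe`, `run` and `percentile` are hypothetical names, and the query is passed in as a `Callable` so that the real search call (or a stand-in) can be plugged in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: fires a query at a fixed rate and records per-call latency.
public class LoadProbe {

    // Starts requestsPerSecond * seconds calls, spaced evenly, each on its own
    // worker thread. Returns the observed latencies in milliseconds, sorted ascending.
    public static List<Long> run(Callable<?> query, int requestsPerSecond, int seconds)
            throws InterruptedException {
        ExecutorService workers = Executors.newCachedThreadPool();
        Queue<Long> latencies = new ConcurrentLinkedQueue<>();
        long gapMillis = 1000L / requestsPerSecond;
        int total = requestsPerSecond * seconds;

        for (int i = 0; i < total; i++) {
            workers.execute(() -> {
                long start = System.nanoTime();
                try {
                    query.call();
                } catch (Exception e) {
                    // A failed call is still timed; only its result is ignored.
                }
                latencies.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
            });
            Thread.sleep(gapMillis); // keep the arrival rate roughly constant
        }

        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);

        List<Long> sorted = new ArrayList<>(latencies);
        Collections.sort(sorted);
        return sorted;
    }

    // Nearest-rank percentile over an ascending, non-empty list (p in 0..100).
    public static long percentile(List<Long> sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.size()) - 1;
        return sorted.get(Math.max(0, idx));
    }
}
```

Calling `LoadProbe.run(query, 3, 30)` with a `Callable` that wraps the search call matches the 3 requests per second for 30 seconds described above. Comparing `percentile(result, 50)` with `percentile(result, 95)` shows whether every call slows down or only the later ones.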
I've followed the pattern above for other Elasticsearch queries and had no problems. The difference here is the number of hits returned and (I think) the ability of the client to deserialise the information into a SearchResponse.
To make the query, I am using the SearchRequestBuilder and the TransportClient. When I run the query in isolation using the Elasticsearch head plugin, it consistently responds in a couple of hundred milliseconds. I have tried using the scan/scroll approach to get the results in 100-hit chunks, but it has had no effect, i.e. times still increase rapidly under load. I have even tried creating a new TransportClient for each request (not advised, I know), and that exhibited the same problem.
I'm a bit at a loss as to how to resolve this issue and would appreciate any advice/help in addressing it. As the request is made in real time, it needs to scale under load. If it takes 600ms to run the query and deserialise the results, that is fine. But if it increases to 2+ seconds under load, then it's a problem.