I'm currently having a problem with our Elasticsearch setup. We have a Java app that uses the TCP client to connect to the Elasticsearch nodes, and a recent requirement makes us request a lot of fields in the _source filter of our query.
This means the JSON response coming back from Elasticsearch has grown significantly. When we test with 20 requests per second, we get a pretty decent average response time. However, if we raise the load to 30 requests per second, response times degrade to around 7 s.
Looking at the "took" value in the response, we can see that the query itself only takes 100 ms to 150 ms.
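To make that gap concrete, here is a minimal sketch of how the client-side wall-clock time could be compared against the server-reported "took". It does not call Elasticsearch itself: `searchAndGetTook` is a hypothetical stand-in for whatever code runs the query and returns the "took" value in milliseconds, and the class and method names are mine, not from any library.

```java
import java.util.function.LongSupplier;

// Sketch only: compares client-observed latency with the server-reported "took".
final class LatencyBreakdown {

    /** One timed call: wall-clock time seen by the client and the reported "took". */
    static final class Sample {
        final long wallMillis;
        final long tookMillis;

        Sample(long wallMillis, long tookMillis) {
            this.wallMillis = wallMillis;
            this.tookMillis = tookMillis;
        }

        /** Time not explained by query execution (network, queuing, parsing, ...). */
        long overheadMillis() {
            return wallMillis - tookMillis;
        }
    }

    /**
     * Runs the supplied call and measures how long it takes on the client side.
     * searchAndGetTook is hypothetical: it should execute the search and
     * return the "took" value (in ms) from the response.
     */
    static Sample time(LongSupplier searchAndGetTook) {
        long start = System.nanoTime();
        long took = searchAndGetTook.getAsLong();
        long wall = (System.nanoTime() - start) / 1_000_000;
        return new Sample(wall, took);
    }
}
```

With the numbers above (about 7000 ms observed, about 150 ms "took"), `overheadMillis()` would come out at roughly 6850 ms, so nearly all of the time is spent outside query execution.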
We're currently optimizing our cluster and machine specs, but it looks as if the issue might be related to the network (even though we see these latency values within the same private network), or to the number of open TCP connections (even though I can't see many, which leads me to believe they're being reused). It could also be serialization/deserialization overhead (we're using Jackson).
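One thing I could check first is whether requests are simply piling up on the client side. By Little's law, a steady 30 requests per second at 7 s each would mean roughly 30 × 7 = 210 requests in flight at once. A rough sketch of a counter that could be wrapped around each call to see this (the class name and the `Supplier` wrapper are my own illustration, not part of the Elasticsearch client or Jackson):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sketch only: tracks how many wrapped calls are running concurrently.
final class InFlightGauge {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    /** Runs the call while counting it as in flight; updates the observed peak. */
    <T> T track(Supplier<T> call) {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            return call.get();
        } finally {
            // Always decrement, even if the call throws.
            inFlight.decrementAndGet();
        }
    }

    /** Number of calls currently executing. */
    int current() {
        return inFlight.get();
    }

    /** Highest concurrency seen so far. */
    int peak() {
        return peak.get();
    }
}
```

If the peak keeps climbing during the 30 requests per second test instead of levelling off, that would point to a queue building up somewhere (client threads, connections, or response parsing) rather than to slow queries.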
Any thoughts on how to debug this?