We're having difficulty with RestClient against a large 700-node ES cluster. Under a moderate search load (47 concurrent searches) we see:
- extensive use of I/O dispatchers (based on the default I/O Reactor configs in the httpclient lib you use)
- apparent queuing of requests
We don't understand the DEFAULT_MAX_CONN_TOTAL vs. DEFAULT_MAX_CONN_PER_ROUTE settings you default to in RestClientBuilder.java. I've been scouring httpclient/core source code to understand.
Main question: when we create a new RestClient against a SLOW ES cluster (can take hours to retrieve 250k docs) - as we call performRequest() over scroll ids - are we creating new connections?
We can't recreate the problem at our office under smaller clusters - unless we can "slow down" ES per query - is there some secret thing we can do to make ES take minutes per query/page/scroll vs. ms/seconds just to keep these connections open longer to test the JVM/threading issues?
I understand it a bit more now. I enabled org.apache.http debug logging, and I can see that we clearly didn't understand the HTTP client config and the defaults used in your RestClient (max of 10 per route, 30 total). Plus, the interaction with the default I/O Reactor in all this is horrible: we're on HttpClient 4.x, so the default is to create NCORES * 2 "I/O dispatcher" threads. For us, that's 160 per connection. With just 47 concurrent searches we have nearly 7500 threads created to do I/O! Plus, with the max route/total limits, we're not even sure what's happening: how requests are queued, etc. Appreciate any light.
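To make the thread arithmetic above concrete, here is a rough sketch in plain Java. It takes the figures from the post at face value (NCORES * 2 dispatcher threads per client, 47 concurrent searches); the 80-core figure is an assumption inferred from 160 = NCORES * 2, and the class and method names are hypothetical.

```java
// Rough estimate of I/O dispatcher threads, using the figures quoted in the post.
final class IoThreadEstimate {

    // Dispatcher threads per client, per the post's description: NCORES * 2.
    static int dispatcherThreadsPerClient(int cores) {
        return cores * 2;
    }

    // Total dispatcher threads when `clients` separate clients exist at once.
    static int totalDispatcherThreads(int clients, int cores) {
        return clients * dispatcherThreadsPerClient(cores);
    }

    public static void main(String[] args) {
        int cores = 80;              // assumption: implied by 160 = NCORES * 2
        int concurrentSearches = 47; // from the post

        System.out.println("per client: " + dispatcherThreadsPerClient(cores));
        System.out.println("one client per search: "
                + totalDispatcherThreads(concurrentSearches, cores));
        System.out.println("one shared client: "
                + totalDispatcherThreads(1, cores));
    }
}
```

Under these assumptions, one client per search gives 47 * 160 = 7520 threads, which matches the "nearly 7500" observed, while a single shared client would stay at 160.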
No, the REST client should be re-using connections.
That sounds very wrong. Are you creating a client per search? You should only normally have a single instance of the client in existence, and it should normally live until your application shuts down.
It is wrong. Our app is a GIS app which supports dynamic (user-driven) searches of thousands of different layers, all from about 10 types of connections (ES, PostgreSQL, MemSQL, REST/WFS, MongoDB, etc.). So our task manager handles queueing and working off searches from a Java thread pool. By default, it creates a NEW RestClient for each search, similar to how we create a new PostgreSQL/PostGIS connection per search (which is fine/expected for that kind of data source).
Thanks, we're now building into the code a capability to keep a static RestClient and set all those configs.
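A minimal sketch of the "keep one client" idea, using only the JDK so it runs without the Elasticsearch client on the classpath. `SharedClientHolder` and its methods are hypothetical names, not part of any library. In a real application the factory would presumably be where `RestClient.builder(...)` is called, with `setHttpClientConfigCallback` used to set `setMaxConnTotal`, `setMaxConnPerRoute` and an `IOReactorConfig` thread count; those values are left out here because the right numbers depend on the workload.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical holder: builds the client once, hands the same instance to every search.
final class SharedClientHolder<C extends AutoCloseable> implements AutoCloseable {

    private final Supplier<C> factory;
    private C client; // guarded by "this"

    SharedClientHolder(Supplier<C> factory) {
        this.factory = factory;
    }

    // Lazily create the client on first use; later calls reuse it.
    synchronized C get() {
        if (client == null) {
            client = factory.get();
        }
        return client;
    }

    // Close the client once, at application shutdown.
    @Override
    public synchronized void close() throws Exception {
        if (client != null) {
            client.close();
            client = null;
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger builds = new AtomicInteger();
        // Stand-in for a real client: counts how often the factory runs.
        SharedClientHolder<AutoCloseable> holder = new SharedClientHolder<>(() -> {
            builds.incrementAndGet();
            return () -> System.out.println("client closed");
        });

        // Simulate 47 searches asking for a client.
        for (int i = 0; i < 47; i++) {
            holder.get();
        }
        System.out.println("clients built: " + builds.get());
        holder.close();
    }
}
```

The worker threads in the task manager would call `get()` instead of constructing a client, and only the shutdown path would call `close()`.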
I'm not sure that's good either, at least it wasn't when I last used PostgreSQL. Each connection to PostgreSQL spawns its own backend process which can itself take many milliseconds and definitely runs into problems with high numbers of concurrent clients. There are workarounds like pgBouncer, but IMO that's a bit of a hack compared with using an in-app connection pool.