The Elasticsearch reference documentation has this sentence:
The http mechanism is completely asynchronous in nature, meaning that there is no blocking thread waiting for a response.
But when I use the HTTP bulk index API, it waits for the index status of each document. So what does "asynchronous" mean in the documentation?
It refers to the server side, where Elasticsearch uses non-blocking I/O to manage incoming connections. When a request comes in, it is asynchronously handed off to a worker thread, which means that the receiving thread can immediately return to service another client request. Eventually, the worker thread completes its work and executes a callback to get the response back out to the client. This is what is meant by "asynchronous", and it is how Elasticsearch is able to handle a large number of requests concurrently without blocking request threads.
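The handoff described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Elasticsearch code: `do_work`, `accept`, and `serve` are hypothetical names, and a dictionary stands in for writing the response back to the client.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def do_work(request_id):
    # Hypothetical stand-in for the real indexing or search work.
    return f"result for request {request_id}"


def accept(pool, request_id, responses, lock):
    """Receiving side: hand the request to a worker and return at once."""
    future = pool.submit(do_work, request_id)

    def send_response(done):
        # Callback that runs once the work is finished; storing the
        # result here stands in for sending the response to the client.
        with lock:
            responses[request_id] = done.result()

    future.add_done_callback(send_response)
    # There is no future.result() call here, so the receiving thread
    # never waits for the worker.


def serve(request_ids, workers=4):
    responses = {}
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for request_id in request_ids:
            accept(pool, request_id, responses, lock)
    # Leaving the with-block waits for outstanding work, so every
    # callback has run by the time we return.
    return responses
```

Calling `serve([1, 2, 3])` accepts all three requests without waiting on any of them; the responses are filled in by the callbacks as the workers finish.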
This is in complete opposition to a server that uses blocking I/O to manage incoming connections. In a typical "prefork" model (e.g., Apache mpm_prefork_module), a fixed number of threads (or child processes) will be preforked and then wait for client requests to come in. When a client request comes in, one of these workers will service the request from end to end (request to response). If all of these preforked workers are busy servicing client requests, too bad.
Think of a typical bank with a fixed number of teller windows; when every teller window is occupied, everyone else in line is blocked from even having their request started on. If instead a bank used a non-blocking model, tellers would take requests from clients, hand off the request to someone else and immediately move on to the next client. At some point, a worker would complete a client request, hand it back to a teller who would then forward the completed request (cash, receipt, etc.) to the appropriate client.
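To put rough numbers on the bank analogy, here is a toy calculation of when each client first gets attention. It is a simplification under stated assumptions: `start_times` is a hypothetical helper, clients are served in arrival order, and in the handoff case the back office is assumed to have enough capacity that only the teller's short handoff time matters.

```python
def start_times(arrivals, time_at_window, tellers):
    """Return the time at which each client is first attended to.

    time_at_window is how long a teller stays busy with one client:
    the full service time in the blocking model, or just the short
    handoff time in the non-blocking model.
    """
    free_at = [0] * tellers  # when each teller window becomes free
    starts = []
    for arrival in arrivals:
        # Pick the teller who becomes free earliest.
        window = min(range(tellers), key=lambda k: free_at[k])
        start = max(arrival, free_at[window])
        free_at[window] = start + time_at_window
        starts.append(start)
    return starts


# Three clients arrive at time 0, and there are two teller windows.
blocking = start_times([0, 0, 0], time_at_window=10, tellers=2)
handoff = start_times([0, 0, 0], time_at_window=1, tellers=2)
```

With these made-up numbers, the third client waits until time 10 in the blocking model but is attended to at time 1 when tellers only hand requests off.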
This is what it means. I was getting confused by all the other non-blocking stuff. All that other stuff I said about non-blocking internal requests is true, but it's not what that sentence was talking about.
It's a reference to the way Elasticsearch is written. When Elasticsearch
sends requests to itself around the cluster it never blocks any of its
threads. Instead it returns the thread to the thread pool and checks a new
one out when the response comes back. Requests can flow around the cluster
pretty fluidly. Searches typically run two requests against the data nodes.
Index requests hit the data node with the primary shard first then are
pushed to the replica shards, etc. The node that actually handles the HTTP
request works the same way.
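As a rough sketch of that flow (primary first, then replicas, with no thread blocked while waiting), here is an asyncio version. It is not how Elasticsearch is implemented; the node names, `send_to_node`, and `index_document` are all made up for illustration, and a short sleep stands in for the network round trip.

```python
import asyncio


async def send_to_node(node, doc, log):
    # Hypothetical network round trip. Awaiting hands the thread back
    # to the event loop instead of blocking it until the reply arrives.
    await asyncio.sleep(0.01)
    log.append(node)
    return f"{node} acked {doc}"


async def index_document(doc, primary, replicas, log):
    # The primary shard's node handles the document first...
    await send_to_node(primary, doc, log)
    # ...then the document is pushed to all replicas concurrently.
    acks = await asyncio.gather(
        *(send_to_node(replica, doc, log) for replica in replicas)
    )
    return list(acks)


def run_demo():
    log = []
    acks = asyncio.run(
        index_document("doc-1", "primary", ["replica-1", "replica-2"], log)
    )
    return log, acks
```

While any of these calls is waiting, the single event loop thread is free to pick up other work, which is the same idea as returning a thread to the pool until the response comes back.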
It's important that Elasticsearch works this way because it means it can
handle lots of requests. Threads are expensive and blocking them is
wasteful. Elasticsearch does lots of other important stuff to scale and the
guide does call all of them out but it calls this out because it's one of
the more important things.
Got it. Thanks a lot.