The dangers are manifold. First, as you said, there is the memory allocation. Queuing also means the sender of the queued jobs considers the job done while it is not. Second, latency is everything when it comes to performance. Jobs hanging in long queues increase the latency of a system from the client's perspective when synchronous responses are required. ES is designed to be asynchronous for that reason, to gain low latency, but this does not come for free. Increased memory usage brings more garbage collections and unnecessary interaction with the OS at the I/O layer. The key word is "back pressure": imagine lots of slow clients that sit there and wait for responses to their requests. Scan/scroll is one such example, where a large amount of data is involved and ES may produce data at a higher rate than it can be consumed. ES must keep all the references to such slow clients open, and this can escalate until the system falls over. Slow clients can kill very fast servers.
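To make the queue argument concrete, here is a minimal Python sketch of a bounded queue that rejects work once it is full instead of growing without limit. It is not ES code: `BoundedWorkQueue`, `submit` and `drain` are hypothetical names chosen for illustration, and only the standard `queue` module is used. The point is that a rejection tells the sender immediately that the job was not taken, rather than letting it pile up in memory.

```python
import queue


class BoundedWorkQueue:
    """Hypothetical helper: a fixed-capacity job queue that rejects instead of growing."""

    def __init__(self, capacity):
        # maxsize bounds memory: at most `capacity` jobs are ever held.
        self._jobs = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def submit(self, job):
        """Try to enqueue a job; return False right away if the queue is full."""
        try:
            self._jobs.put_nowait(job)
            return True
        except queue.Full:
            # Back pressure: the sender learns now that the job was not accepted.
            self.rejected += 1
            return False

    def drain(self):
        """Run all queued jobs in order and return their results."""
        results = []
        while True:
            try:
                job = self._jobs.get_nowait()
            except queue.Empty:
                return results
            results.append(job())


# Five jobs are offered to a queue with room for two.
work_queue = BoundedWorkQueue(capacity=2)
accepted = [work_queue.submit(lambda i=i: i * i) for i in range(5)]
results = work_queue.drain()
```

With a capacity of two, only the first two submissions are accepted and the other three are rejected at once, so the sender can slow down or retry later instead of assuming the work is done.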
One strategy is to bring server and client into a dynamic balance. For example, this could be done with reactive streams: bidirectional streams that adjust themselves according to the current capacity of the client or the server.
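The idea of demand-driven streaming can be sketched in a few lines of Python. This is only an illustration of the principle, not a real reactive streams implementation and not anything ES provides: the classes `DemandDrivenPublisher` and `SlowSubscriber` and their methods are hypothetical, loosely modelled on the request(n) style of signalling demand. The publisher emits only as many items as the subscriber has asked for, so the subscriber's buffer never grows beyond its own batch size.

```python
class DemandDrivenPublisher:
    """Hypothetical publisher that emits only as many items as were requested."""

    def __init__(self, source):
        self._source = iter(source)
        self._subscriber = None

    def subscribe(self, subscriber):
        self._subscriber = subscriber
        subscriber.on_subscribe(self)

    def request(self, n):
        """Emit up to n items; signal completion when the source is exhausted."""
        for _ in range(n):
            try:
                item = next(self._source)
            except StopIteration:
                self._subscriber.on_complete()
                return
            self._subscriber.on_next(item)


class SlowSubscriber:
    """Hypothetical consumer that asks for one small batch at a time."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.max_buffered = 0
        self.received = []
        self.completed = False
        self._subscription = None

    def on_subscribe(self, subscription):
        self._subscription = subscription

    def on_next(self, item):
        self.buffer.append(item)
        self.max_buffered = max(self.max_buffered, len(self.buffer))

    def on_complete(self):
        self.completed = True

    def run(self):
        """Request a batch, process it, and only then signal more demand."""
        while not self.completed:
            self._subscription.request(self.batch_size)
            self.received.extend(self.buffer)  # stand-in for slow processing
            self.buffer.clear()


publisher = DemandDrivenPublisher(range(7))
subscriber = SlowSubscriber(batch_size=3)
publisher.subscribe(subscriber)
subscriber.run()
```

Here the consumer sets the pace: however fast the publisher could produce, it never has more than `batch_size` items outstanding for this subscriber.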