Elasticsearch: Configuring keep-alive time for scaling threads

I am running into a Java OutOfMemoryError during successive index operations. I have a custom ES plugin that makes an API call for each document, and a new thread is created for every one of those calls. This is the stack trace of the exception:

java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.lang.Thread.start0(Native Method) ~[?:?]
    at java.lang.Thread.start(Thread.java:801) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:939) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1007) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
    at java.lang.Thread.run(Thread.java:832) [?:?]

The root cause is that Elasticsearch tries to create more threads than the configured LimitNPROC allows, which is 4096 in my case. I was able to get rid of the error by raising LimitNPROC to 10000 in /usr/lib/systemd/system/elasticsearch.service.

Raising the thread limit is not a real solution, though: I hit the same error again once the number of documents to index increased.
I observed that threads are not released immediately after the indexing operation for a document finishes, so threads are created much faster than they are released.
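
To make the thread growth concrete: with one new thread per document, the thread count follows the number of API calls in flight. The sketch below shows the alternative of a bounded pool, where excess calls wait in a queue instead of spawning threads. It is not the plugin's actual code; the class name BoundedPoolSketch, the pool size, and the Thread.sleep standing in for the API call are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not code from the plugin.
final class BoundedPoolSketch {

    // Runs `tasks` simulated API calls on a fixed pool of `poolSize` threads
    // and returns the highest number of calls that were in flight at once.
    static int peakConcurrency(int tasks, int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the maximum seen
                try {
                    Thread.sleep(5); // stand-in for the external API call (assumption)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    done.countDown();
                }
            });
        }

        done.await();      // wait until every simulated call has finished
        pool.shutdown();   // release the pool's threads
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 200 calls, but never more than 8 threads doing work at the same time.
        System.out.println("peak concurrent calls: " + peakConcurrency(200, 8));
    }
}
```

With a pool like this the number of threads stays at or below the pool size no matter how many documents are indexed; the cost is that calls queue up when the external API is slow.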

My question:

I came across the scaling thread pool type in the ES docs, which, as I understand it, is used for indexing and has a keep-alive setting. Is it possible to reduce the keep-alive time of these threads (I am assuming scaling threads are the ones being created for these API calls), and is it a good idea to reduce it?
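
To show what I mean by keep-alive, here is a sketch using a plain java.util.concurrent.ThreadPoolExecutor, the class that appears in the stack trace. It is not Elasticsearch's own thread pool code and does not use any ES setting; the class name KeepAliveSketch, the method name, and all sizes and timings are assumptions for illustration. The pool grows from zero up to a maximum and lets idle threads exit once the keep-alive time has passed.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of keep-alive on a growing pool.
final class KeepAliveSketch {

    // Submits `tasks` short jobs to a pool that grows from 0 up to `maxThreads`,
    // then stays idle for `idleMillis` and reports how many threads remain.
    static int threadsLeftAfterIdle(int tasks, int maxThreads,
                                    long keepAliveMillis, long idleMillis)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, maxThreads,                              // no core threads, bounded maximum
                keepAliveMillis, TimeUnit.MILLISECONDS,     // idle threads exit after this time
                new SynchronousQueue<>(),                   // hand tasks directly to a thread
                new ThreadPoolExecutor.CallerRunsPolicy()); // when saturated, run in the caller

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // stand-in for the external API call (assumption)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        Thread.sleep(idleMillis);         // let the pool sit idle
        int remaining = pool.getPoolSize();

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return remaining;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("short keep-alive: " + threadsLeftAfterIdle(20, 4, 50, 1000));
        System.out.println("long keep-alive:  " + threadsLeftAfterIdle(20, 4, 60_000, 200));
    }
}
```

In this sketch a shorter keep-alive only makes idle threads go away sooner. It is the maximum pool size, not the keep-alive time, that limits how many threads can exist while calls are still running.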

Please let me know if any further details are required.