Elasticsearch num processors/bulk threads detection

How does Elasticsearch detect the number of available processors? Each node in our cluster has 48 cores, but the bulk thread pool size is 32; I am just trying to find out how Elasticsearch arrives at that number.
Below is the output from curl -XGET localhost:9200/_nodes/ for one of the nodes.

"cpu" : {
  "vendor" : "Intel",
  "model" : "Xeon",
  "mhz" : 2501,
  "total_cores" : 48,
  "total_sockets" : 1,
  "cores_per_socket" : 32,
  "cache_size_in_bytes" : 30720
},

What you see is information from the Sigar library, which is known to return wrong values: https://github.com/hyperic/sigar/issues/63

but these values were never used for ES internal configuration.

Sigar was removed in May 2015,

so I wonder which Elasticsearch version you are using.

ES uses Runtime.getRuntime().availableProcessors(), which depends on the JVM implementation.

Use Java 8+ and Elasticsearch 2+ for best results.

Thanks! That's interesting, but I am seeing the same behavior on 1.4 and 2.3.3 (with Java 8u20) as well. Also, when I create a simple Java program and run it on the machine, the output of Runtime.getRuntime().availableProcessors() is 48.
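
For reference, a minimal version of that check is just a few lines of plain Java (the class name is mine, nothing Elasticsearch-specific):

    // Standalone check of what the JVM reports on this machine; this is the
    // value Elasticsearch starts from before applying its own upper bound.
    public class ProcessorCheck {
        public static void main(String[] args) {
            System.out.println(Runtime.getRuntime().availableProcessors()); // prints 48 on these nodes
        }
    }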

I looked into the code: Elasticsearch applies a hard-coded upper limit of 32 to processor-based thread pool sizes.

See org.elasticsearch.common.util.concurrent.EsExecutors

    /**
     * Returns the number of processors available but at most <tt>32</tt>.
     */
    public static int boundedNumberOfProcessors(Settings settings) {
        /* This relates to issues where machines with large number of cores
         * ie. >= 48 create too many threads and run into OOM see #3478
         * We just use an 32 core upper-bound here to not stress the system
         * too much with too many created threads */
        int defaultValue = Math.min(32, Runtime.getRuntime().availableProcessors());
        try {
            defaultValue = Integer.parseInt(System.getProperty(DEFAULT_SYSPROP));
        } catch (Throwable ignored) {}
        return settings.getAsInt(PROCESSORS, defaultValue);
    }

The reason for the limit is explained in the code comment above and in issue #3478.
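
Note that the hard-coded 32 is only the default: the last line of boundedNumberOfProcessors lets an explicit setting win, so a processors setting (for example processors: 48 in elasticsearch.yml) should lift the bound. Here is a standalone sketch of that precedence, with the Settings lookup replaced by a plain Integer so it runs without Elasticsearch on the classpath (class and parameter names are mine):

    // Sketch of the precedence in boundedNumberOfProcessors above; the
    // "processors" setting name and the 48-core figure come from this thread.
    public class BoundedProcessorsSketch {
        static int boundedNumberOfProcessors(Integer processorsSetting, int availableProcessors) {
            int defaultValue = Math.min(32, availableProcessors);     // hard-coded upper bound
            return processorsSetting != null ? processorsSetting : defaultValue;
        }

        public static void main(String[] args) {
            System.out.println(boundedNumberOfProcessors(null, 48));  // 32 -> default is capped
            System.out.println(boundedNumberOfProcessors(48, 48));    // 48 -> explicit setting wins
        }
    }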

While I share the diagnosis, I do not follow how a local limit of 32 can be sure to prevent "thread explosions", since Elasticsearch has six thread pools that are proportional to the available processor count: INDEX, BULK, GET, SEARCH, SUGGEST, PERCOLATE. In addition, there are five thread pools sized at 50% of the processor count but capped at 5 or 10 threads: LISTENER, FLUSH, REFRESH, WARMER, SNAPSHOT. And finally, there are two pools sized at double the processor count: FETCH_SHARD_STARTED and FETCH_SHARD_STORE. That can result in 6×32 + 3×5 + 2×10 + 2×2×32 = 192 + 15 + 20 + 128 = 355 threads in the ES pools on 32+ core machines, plus the threads run by Netty. Anyway, we can conclude that in most situations there are still enough ES/Netty threads for the operating system to schedule with maximum efficiency on any existing CPU hardware.
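
To make that arithmetic explicit, here is a small throwaway calculation of the worst case, assuming the bounded value of 32 processors; which of the half-sized pools cap at 5 versus 10 threads is my reading of the 2.x ThreadPool defaults:

    // Back-of-the-envelope count of the maximum ES pool threads described above.
    public class PoolThreadEstimate {
        public static void main(String[] args) {
            int p = 32;                        // bounded processor count
            int proportional = 6 * p;          // INDEX, BULK, GET, SEARCH, SUGGEST, PERCOLATE
            int halfCapped   = 3 * 5 + 2 * 10; // FLUSH, WARMER, SNAPSHOT at 5; LISTENER, REFRESH at 10
            int doubled      = 2 * 2 * p;      // FETCH_SHARD_STARTED, FETCH_SHARD_STORE
            System.out.println(proportional + halfCapped + doubled); // 192 + 35 + 128 = 355
        }
    }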

Users with machines of more than 32 cores and a very, very specific or exotic workload pattern might want deeper insight into whether exceeding the pool size limit can also increase efficiency. The parallelism of a single action type executing on a node is bounded by the hard-coded maximum of 32 threads, but I doubt that this boundary is really a problem, or that its effect can be measured at all.

My machines have core counts of 36 and 40; if I find the time, it should be possible to run benchmarks with a mixed search/index workload, with the hard-coded limit enabled and disabled.

Thanks a lot!