I tried to concurrently index lots of small documents into an index with 1 shard and 0 replicas. Going from 1 indexing thread to 3 gives a throughput gain of only about 1.5x, while I expected at least 2.5x. CPU usage of the Elasticsearch process increased about 3x (contention?). Is this normal? Is there a way to improve concurrency within a single shard?
I'm using elasticsearch-2.3.5 with default settings on Debian stable (Linux 3.16.0-4-amd64).
$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
merge.scheduler.max_thread_count: 1 (also tried 3, with no significant change)
My app is a multithreaded Java application: each indexer thread uses its own TransportClient and sends bulk index requests with a bulk size of 10,000. The app is not the bottleneck: its CPU usage is about 35%.
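The client-side structure described above can be sketched roughly as follows. This is a minimal illustration, not the actual application code: `BulkSink` is a hypothetical stand-in for whatever blocking bulk call each thread makes through its own client, the JSON payload is made up, and the sketch assumes the bulk size of 10,000 counts documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class BulkIndexerSketch {

    /** Hypothetical stand-in for a blocking bulk request sent by one indexer thread. */
    interface BulkSink {
        void send(List<String> batch) throws Exception;
    }

    /** Runs `threads` independent indexer threads; returns the total number of documents sent. */
    static long run(int threads, int docsPerThread, int bulkSize, BulkSink sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong sent = new AtomicLong();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                final int threadId = t;
                futures.add(pool.submit(() -> {
                    List<String> batch = new ArrayList<>(bulkSize);
                    for (int i = 0; i < docsPerThread; i++) {
                        // Made-up small document; real documents would go here.
                        batch.add("{\"thread\":" + threadId + ",\"seq\":" + i + "}");
                        if (batch.size() == bulkSize) {
                            sink.send(batch);
                            sent.addAndGet(batch.size());
                            batch = new ArrayList<>(bulkSize);
                        }
                    }
                    if (!batch.isEmpty()) { // flush the final partial batch
                        sink.send(batch);
                        sent.addAndGet(batch.size());
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // rethrows any failure from an indexer thread
            }
        } finally {
            pool.shutdown();
        }
        return sent.get();
    }

    public static void main(String[] args) throws Exception {
        // No-op sink: exercises only the threading and batching logic.
        long total = run(3, 25_000, 10_000, batch -> { });
        System.out.println("documents sent: " + total);
    }
}
```

With this shape the threads share nothing on the client side except the counter, so any serialization would have to come from the server or the network rather than from the indexing loop itself.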
I/O is not a bottleneck: according to iostat, my SSD utilization is about 10% most of the time.
Memory is not a problem: 3+ GB of cached memory, swap is disabled. Elasticsearch process: virt 8.5g, res 1.8g.
I know that sharding will help, but I'm also trying to maximise per-shard performance (e.g. for transient bulk-load cases).
Index stats after indexing with 3 threads:
health status index pri rep docs.count docs.deleted store.size pri.store.size
green  open   main    1   0   26849945            0      5.4gb          5.4gb
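To put a number on "small documents", the stats above give a rough on-disk size per document. This is a back-of-the-envelope sketch: it assumes `gb` means 1024^3 bytes, and the result reflects the stored (compressed, indexed) size rather than the raw source size. The class and method names are only for illustration.

```java
final class StoreSizePerDoc {

    /** Average on-disk bytes per document, given the store size in GiB and the document count. */
    static double bytesPerDoc(double storeSizeGb, long docCount) {
        return storeSizeGb * 1024 * 1024 * 1024 / docCount;
    }

    public static void main(String[] args) {
        // Values taken from the index stats line: 5.4gb store size, 26849945 documents.
        System.out.printf("%.0f bytes per document%n", bytesPerDoc(5.4, 26_849_945L));
    }
}
```

That works out to roughly 216 bytes per document on disk, give or take a few bytes since the store size is rounded to one decimal place.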
A performance comparison of the 1-thread and 3-thread cases is attached (again, I expected a speedup of more than 2.5x). The X axis is time in seconds.
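One rough way to quantify the gap between the observed 1.5x and the expected 2.5x is Amdahl's law, which relates speedup on n threads to the fraction of work that is effectively serialized. The sketch below is only a model, not a measurement of Elasticsearch internals, and the class and method names are made up for illustration.

```java
final class AmdahlEstimate {

    /** Speedup predicted by Amdahl's law for a serial fraction in [0, 1] and a thread count. */
    static double speedup(double serialFraction, int threads) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / threads);
    }

    /** Serial fraction implied by an observed speedup: s = (n / S - 1) / (n - 1). */
    static double impliedSerialFraction(double observedSpeedup, int threads) {
        if (threads < 2 || observedSpeedup <= 0) {
            throw new IllegalArgumentException("need threads >= 2 and a positive speedup");
        }
        return ((double) threads / observedSpeedup - 1.0) / (threads - 1);
    }

    public static void main(String[] args) {
        // Observed: 1.5x on 3 threads. Expected: at least 2.5x on 3 threads.
        System.out.printf("observed serial fraction: %.2f%n", impliedSerialFraction(1.5, 3));
        System.out.printf("expected serial fraction: %.2f%n", impliedSerialFraction(2.5, 3));
    }
}
```

Under this model, 1.5x on 3 threads corresponds to about half of the work being serialized (0.50), whereas 2.5x would correspond to about a tenth (0.10). If the model holds, adding more threads to a single shard would top out near 2x.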