How to increase the number of concurrent bulk threads when using an RPC analyzer


(Liuguangmin) #1

Hello everyone,
I am developing an RPC analyzer that calls a remote server: my analyzer acts as the client and communicates with the server through Thrift. My environment is as follows: Elasticsearch 5.5.2 with 5 nodes, Java JDK 1.8, and 16 CPU cores. I use the default settings, so each node in my cluster has at most 16 bulk threads (the documentation says "The maximum size for this pool is 1 + # of available processors."). I have also found that as the number of shards increases, the number of active bulk threads grows, but the maximum is still 16. If I have too many shards, I find that this also influences the BM25 score, because BM25 has an IDF factor and IDF is calculated separately on every shard.
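To illustrate the per-shard IDF effect, here is a minimal sketch. It assumes the IDF formula used by Lucene's BM25Similarity, ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents and n is the number of documents containing the term. The corpus size, the term frequencies, and the function name `bm25_idf` are made up for illustration.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF in the form used by Lucene's BM25Similarity (assumed here)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical corpus: 1,000,000 documents, 1,000 of them contain the term.
total_docs, total_doc_freq = 1_000_000, 1_000
global_idf = bm25_idf(total_docs, total_doc_freq)

# The same corpus split evenly over 80 shards. The term is spread unevenly
# (made-up numbers): one shard holds 2 matching docs, another holds 50.
docs_per_shard = total_docs // 80
idf_rare_shard = bm25_idf(docs_per_shard, 2)
idf_common_shard = bm25_idf(docs_per_shard, 50)

print(f"global IDF:            {global_idf:.3f}")
print(f"shard with 2 matches:  {idf_rare_shard:.3f}")
print(f"shard with 50 matches: {idf_common_shard:.3f}")
```

With more shards, each shard sees fewer documents, so the per-shard statistics drift further from the global ones and the same term can get noticeably different scores on different shards.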
I have some questions:
(1) Because I use an RPC analyzer, the network overhead is expensive, so I would like to increase the number of concurrent threads. However, the bulk pool has only 16 threads, and I think the concurrency is also related to the number of shards. Although I use Thrift and initialized the thread pool size to 2000 in my Thrift configuration, I find that the number of active threads in that pool is also related to the number of shards. How can I increase the concurrency available to the RPC analyzer? Is there another method? And why can't the Elasticsearch bulk thread pool size be raised to 2000 or more? I think this is a bottleneck.
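The arithmetic behind this concern can be sketched roughly as follows. This is only a back-of-the-envelope model: it assumes every indexed document triggers one blocking RPC call on a bulk thread, and the 10 ms round-trip latency is a made-up number, not a measurement. The function names are hypothetical.

```python
def bulk_pool_sizes(available_processors: int) -> tuple:
    """Default and maximum bulk pool size per node, as documented for ES 5.x."""
    default_size = available_processors
    max_size = available_processors + 1
    return default_size, max_size


def max_docs_per_second(concurrent_threads: int, rpc_latency_ms: float) -> float:
    """Upper bound on throughput when each document blocks a thread for one RPC.

    Little's law: throughput = concurrency / latency.
    """
    return concurrent_threads * 1000.0 / rpc_latency_ms


nodes, cores = 5, 16
default_size, max_size = bulk_pool_sizes(cores)
cluster_threads = nodes * default_size  # bulk threads across the cluster

# Assumed (not measured) RPC round-trip latency of 10 ms per document.
bound = max_docs_per_second(cluster_threads, rpc_latency_ms=10.0)
print(f"bulk threads per node: {default_size} (max {max_size})")
print(f"bulk threads in cluster: {cluster_threads}")
print(f"upper bound with 10 ms RPC latency: {bound:.0f} docs/s")
```

Under this model the Thrift pool size of 2000 does not help by itself, because no more RPC calls can be in flight than there are bulk threads issuing them.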
(2) I set the number of shards to 80 (16*5), and the indexing rate is 2000/s. That is too many shards, so I used the shrink API to reduce the shard count. The problem I found is that a copy of every shard of the index must first be allocated to a single node. Why does shrink not support an index whose shards are spread across nodes? Will a future version support shrinking an index across different nodes?
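For reference, the shrink workflow I am describing looks roughly like the sketch below. The request bodies follow the Elasticsearch 5.x shrink API documentation as I understand it; the index names `my_source_index` and `my_target_index`, the node name `shrink_node_name`, and the helper `valid_shrink_targets` are placeholders for illustration. The target shard count has to be a factor of the source shard count.

```python
import json


def valid_shrink_targets(source_shards: int) -> list:
    """Shard counts a source index can shrink to: factors of the source count."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


# Step 1: PUT /my_source_index/_settings
# Relocate a copy of every shard to one node and block writes.
prepare_body = {
    "settings": {
        "index.routing.allocation.require._name": "shrink_node_name",
        "index.blocks.write": True,
    }
}

# Step 2: POST /my_source_index/_shrink/my_target_index
# Pick a target shard count from the valid factors (5 chosen arbitrarily here).
targets = valid_shrink_targets(80)
shrink_body = {
    "settings": {
        "index.number_of_shards": 5,
        "index.number_of_replicas": 1,
    }
}

print("valid target shard counts for 80 shards:", targets)
print(json.dumps(prepare_body, indent=2))
print(json.dumps(shrink_body, indent=2))
```

As far as I can tell from the documentation, the single-node requirement exists because shrink hard-links the segment files of the source index into the target index, which only works when all of them are on the same file system.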
THX

