Hi everybody.
I'm getting strange results: multi-process search is too slow.
I compared one index of 35 billion documents against 10 indices of 3.5 billion documents each. I'm using Elasticsearch 7.6 with 2 data nodes (52 cores, 62GB memory). I ran three experiments:
- A search query against the 35-billion-document index (50 shards).
- A search query against the 10 indices of 3.5 billion documents each (5 shards per index), passed as a comma-separated list like ["index_A", "index_B", ... ].
- A search query against the same 10 indices (5 shards per index) from the Python client, using pool.starmap() with 10 processes. Each process queries one index.
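For reference, the three experiments can be sketched roughly like this. It is only a sketch under assumptions: the host URL, the name `index_all`, the index names after "index_B", the placeholder `match_all` query, and the helper names are all made up for illustration, and the calls are those of the 7.x `elasticsearch` Python client.

```python
from multiprocessing import Pool

# Assumed values, not taken from the real cluster.
HOST = "http://localhost:9200"
BIG_INDEX = "index_all"                                   # hypothetical name
SMALL_INDICES = ["index_" + chr(ord("A") + i) for i in range(10)]
QUERY = {"query": {"match_all": {}}}                      # placeholder query


def search_took_ms(host, index, body):
    """Run one search and return the server-side `took` time in milliseconds."""
    # Imported here so that every worker process builds its own client.
    from elasticsearch import Elasticsearch
    es = Elasticsearch([host])
    return es.search(index=index, body=body)["took"]


def build_tasks(host, indices, body):
    """One (host, index, body) argument tuple per index, for pool.starmap()."""
    return [(host, name, body) for name in indices]


def run_experiments():
    # 1st experiment: the single large index.
    single = search_took_ms(HOST, BIG_INDEX, QUERY)
    # 2nd experiment: all small indices in one comma-separated request.
    combined = search_took_ms(HOST, ",".join(SMALL_INDICES), QUERY)
    # 3rd experiment: one process per small index.
    with Pool(processes=len(SMALL_INDICES)) as pool:
        tasks = build_tasks(HOST, SMALL_INDICES, QUERY)
        per_index = pool.starmap(search_took_ms, tasks)
    return single, combined, per_index
```

`run_experiments()` needs a live cluster, and on platforms that spawn worker processes it should be called from under an `if __name__ == "__main__":` guard.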
I expected the 3rd result to be ten times faster than the 1st, because a smaller index should be faster to search. But the three results are almost the same, and I can't understand why.
- Can you explain why this happens?
- I profiled the queries with the Profile API. 'build_scorer' accounted for almost all of the elapsed time. In the 3rd experiment, as the number of processes increased, the 'build_scorer' time also increased. Why?
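To compare the 'build_scorer' time between runs, something like the following can be used on a response from a search sent with `"profile": true`. It is a sketch, not the exact script used here: the function name is hypothetical, and it reads only the top-level query nodes on the assumption that a parent's timings already include its children.

```python
def build_scorer_by_shard(response, key="build_scorer"):
    """Return {shard id: nanoseconds spent in `key`} for a profiled search."""
    result = {}
    for shard in response["profile"]["shards"]:
        nanos = 0
        for search in shard["searches"]:
            # Top-level query nodes only; children are assumed to be
            # included in their parent's breakdown already.
            for node in search["query"]:
                nanos += node.get("breakdown", {}).get(key, 0)
        result[shard["id"]] = nanos
    return result
```

Running it on the profile output of each experiment gives one number per shard, which makes it easier to see whether 'build_scorer' grows on every shard or only on some of them as processes are added.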