Hello,
we upgraded our system from ES 2.4 to ES 7.16 some time ago. We have now noticed a huge indexing performance drop for one very specific use case of ours.
The use case is as follows: we update many documents, and after each individual document we need to run a search that already reflects the just-updated document. Therefore we run an update, refresh, search cycle for every single document.
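To make this concrete, here is a minimal sketch of the per-document cycle, written against the 7.x Python client; the index name, field names, and query are placeholders, not our actual code:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # single-node cluster

def update_refresh_search(doc_id, changed_fields):
    # 1) Update the single document
    es.update(index="my-index", id=doc_id, body={"doc": changed_fields})

    # 2) Refresh so the change is visible to the following search
    es.indices.refresh(index="my-index")

    # 3) Search that must already reflect the document updated above
    return es.search(index="my-index", body={"query": {"match": {"some_field": "some value"}}})
```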
I know this is not optimal because it prevents us from doing bulk updates and also causes many refreshes. Compared to our other use cases, where we are able to do bulk updates, it performs badly. This is well documented and we are aware of it.
However, when we were on ES 2.4, we could live with the performance. After the upgrade to ES 7.16, this particular use case suddenly became 10x slower. So, if possible, the goal is to at least get close to the performance we had with ES 2.4.
Here are some numbers for our system:
- Index size 75 GB
- ~ 16 million documents
- ES cluster with 1 node
- 1 primary shard, 0 replicas
We noticed that it is the call to refresh that takes most of the time (95%) compared to update and search. We also noticed that the refresh time heavily depends on the amount of data in the index (there is still a 30% performance drop on an empty index compared to ES 2.4).
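For reference, such a breakdown can be measured by simply timing each of the three calls separately, roughly along these lines (again a sketch with placeholder names):

```python
import time

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
totals = {"update": 0.0, "refresh": 0.0, "search": 0.0}

def timed(label, fn):
    # Accumulate wall-clock time per call type
    start = time.perf_counter()
    result = fn()
    totals[label] += time.perf_counter() - start
    return result

# Placeholder update stream; in reality this is our list of documents to update
pending_updates = [("1", {"status": "done"}), ("2", {"status": "open"})]

for doc_id, changes in pending_updates:
    timed("update", lambda: es.update(index="my-index", id=doc_id, body={"doc": changes}))
    timed("refresh", lambda: es.indices.refresh(index="my-index"))
    timed("search", lambda: es.search(index="my-index", body={"query": {"match_all": {}}}))

print(totals)  # in our case, "refresh" accounts for ~95% of the total time
```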
Because of that, we re-indexed into a new index with 6 primary shards. This already improved performance a lot, but it is still 5x slower.
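For completeness, the re-index was essentially just a new index with more primary shards plus a copy of the data, roughly like this (a sketch only; index names are placeholders and the mapping is omitted):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Create a 6-shard target index and copy the data over with _reindex
es.indices.create(
    index="my-index-6shards",
    body={"settings": {"number_of_shards": 6, "number_of_replicas": 0}},
)
es.reindex(
    body={"source": {"index": "my-index"}, "dest": {"index": "my-index-6shards"}},
    wait_for_completion=True,
    request_timeout=3600,  # copying ~16 million documents takes a while
)
```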
The CPU is almost fully utilized during the run, so something CPU-intensive must be going on there.
We also tried a 3-node cluster in which each node is as powerful as the one in the single-node setup. With 6 primary shards per node and 0 replica shards, performance is even worse than in the single-node setup.
Questions:
- What could be the bottleneck here, and can it be tuned somehow (ES settings, system settings, hardware improvements)?
- Can this be profiled somehow to find out the bottleneck?
- Should we try even more primary shards in order to improve performance here or will this only lead to over-sharding?
- Why does ES 2.4 perform so much better in that regard with just one 75 GB shard?
- Why is performance worse when more nodes are added without replicas? I expected the load to be distributed across the nodes somehow.
Here is a screenshot of Kibana's advanced index dashboard during a run of the use case. The most interesting part is between 11:58 and 12:00.
Thanks in advance for your help!
Kind Regards,
Daniel