Bulk index slowing down as index size increases

Phew, I think I finally found the root cause of the indexing slowdown: ICU4J transliteration, combined with the fact that after about 13.2M documents the data starts to contain a lot of Chinese text.

I asked in a new post how to avoid duplicate transliterations. My assumption is that using icu_transform on one property with multiple fields runs the transliteration once per field, on the same identical text. The new post is: ICU transform filters slowing down indexing: how avoid duplicate transliterations?
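To make the suspected setup concrete, here is a minimal sketch of an index definition where one property and one of its sub-fields both use an analyzer containing an icu_transform filter. Every name in it (`latin_transform`, `latin_words`, `latin_prefix`, `prefix_ngram`, the `title` property and its sub-fields) is hypothetical and not taken from my real mapping, and the transform `id` is only an example rule. The helper just counts how many analyzed fields of a property would run the transform, assuming each field runs its own analysis chain.

```python
# Hypothetical index body: all filter, analyzer and field names are illustrative.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "latin_transform": {
                    "type": "icu_transform",
                    "id": "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC",
                },
                "prefix_ngram": {"type": "edge_ngram", "min_gram": 1, "max_gram": 10},
            },
            "analyzer": {
                "latin_words": {
                    "tokenizer": "icu_tokenizer",
                    "filter": ["latin_transform", "lowercase"],
                },
                "latin_prefix": {
                    "tokenizer": "icu_tokenizer",
                    "filter": ["latin_transform", "lowercase", "prefix_ngram"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "latin_words",
                "fields": {
                    "prefix": {"type": "text", "analyzer": "latin_prefix"},
                    "raw": {"type": "keyword"},  # not analyzed, so no transform
                },
            }
        }
    },
}


def icu_transform_passes(body, prop):
    """Count the analyzed fields of `prop` whose analyzer includes an icu_transform filter."""
    analysis = body["settings"]["analysis"]
    # Names of all custom filters that are of type icu_transform.
    transform_filters = {
        name for name, spec in analysis["filter"].items()
        if spec.get("type") == "icu_transform"
    }
    field = body["mappings"]["properties"][prop]
    # Analyzer of the main field plus the analyzers of its multi-fields.
    analyzers = [field.get("analyzer")]
    analyzers += [sub.get("analyzer") for sub in field.get("fields", {}).values()]
    return sum(
        1 for name in analyzers
        if name in analysis["analyzer"]
        and transform_filters & set(analysis["analyzer"][name]["filter"])
    )


if __name__ == "__main__":
    print(icu_transform_passes(INDEX_BODY, "title"))  # prints 2
```

In this sketch the same `title` text would go through the transform twice, once for the main field and once for the `prefix` sub-field, which is the duplication I am asking about.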

Thanks!

PS. @Christian_Dahlqvist feel free to continue with the good insights in the new post! :slight_smile:
