I have the following situation: Logstash with a JDBC input and an Elasticsearch output. The data is loaded into Elasticsearch from scratch, i.e. the indices being filled do not exist before the Logstash run; Logstash creates them based on mapping templates. Logstash and Elasticsearch are both on version 7.17.0 and run in Kubernetes.
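To make that concrete: the index is created on the fly from a matching template when Logstash sends its first bulk request. Below is a minimal sketch of how that precondition can be checked before a load (Python; the unauthenticated endpoint at localhost:9200 is an assumption, not part of my actual setup):

import requests

ES = "http://localhost:9200"   # assumption: plain HTTP, no authentication
INDEX = "myindex"

# The index must not exist before the run; Logstash creates it with the first bulk request.
head = requests.head(f"{ES}/{INDEX}")
print("index exists before load:", head.status_code == 200)

# The mapping comes from a template whose index pattern matches the index name.
templates = requests.get(f"{ES}/_cat/templates", params={"v": "true", "s": "name"})
print(templates.text)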
This scenario worked perfectly fine until recently.
Issue
The indexing works fine until around 1 million documents. Then the indexing of new documents slows down drastically and the number of deleted documents oscillates heavily: it rises sharply and then drops again at short intervals.
I am aware that documents are deleted as part of an update operation, and that was already the case before. But the deleted documents count used to mostly increase and did not oscillate to that extent. Before the issue occurred, the load ended with around 8 million docs.count and around 3 million docs.deleted. At the moment docs.count gets stuck at around 1.2 million.
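The snapshots further below come from _cat/indices; a small polling loop like the following can collect them (a sketch; the endpoint and the 10-minute interval are assumptions):

import time
import requests

ES = "http://localhost:9200"   # assumption: plain HTTP, no authentication
INDEX = "myindex"
COLS = "health,status,index,uuid,pri,rep,docs.count,docs.deleted,store.size,pri.store.size"

# Poll _cat/indices and print a timestamped line per snapshot.
while True:
    r = requests.get(f"{ES}/_cat/indices/{INDEX}", params={"h": COLS})
    print(time.strftime("%H:%M"), r.text.strip())
    time.sleep(600)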
The scenario is deployed in two environments, and only one of them shows this behavior. Both use the same source and the same Logstash configuration; they differ only in their Elasticsearch instance, and both instances are configured identically.
Disk space is fine and there are no errors in the Elasticsearch log. I can also confirm that Logstash is running fine and is continuously sending bulk requests to Elasticsearch.
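One way to check whether Elasticsearch itself is under pressure is to look at the indexing, merge and refresh statistics and at the write thread pool; a sketch of such checks (Python; endpoint and index name are assumptions as above):

import requests

ES = "http://localhost:9200"   # assumption: plain HTTP, no authentication
INDEX = "myindex"

# Index-level stats: indexing throughput, merge activity and refreshes.
stats = requests.get(f"{ES}/{INDEX}/_stats/indexing,merges,refresh").json()
total = stats["indices"][INDEX]["total"]
print("index_total:   ", total["indexing"]["index_total"])
print("index_time_ms: ", total["indexing"]["index_time_in_millis"])
print("merges_current:", total["merges"]["current"])
print("merge_time_ms: ", total["merges"]["total_time_in_millis"])
print("refresh_total: ", total["refresh"]["total"])

# Node-level write thread pool: a growing queue or rejections would indicate back pressure.
tp = requests.get(f"{ES}/_cat/thread_pool/write", params={"v": "true", "h": "node_name,active,queue,rejected"})
print(tp.text)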
Time series of the metrics
time   health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
12:30  yellow open   myindex LLC8qxSWTWyO1U25Olljlg   1   1    1182676       166056    642.3mb        642.3mb
12:40  yellow open   myindex LLC8qxSWTWyO1U25Olljlg   1   1    1182676       533339    946.9mb        946.9mb
13:00  yellow open   myindex LLC8qxSWTWyO1U25Olljlg   1   1    1182676       349747    701.9mb        701.9mb
14:00  yellow open   myindex LLC8qxSWTWyO1U25Olljlg   1   1    1182678       467651        1gb            1gb
14:30  yellow open   myindex LLC8qxSWTWyO1U25Olljlg   1   1    1182678       693906        1gb            1gb