I have a number of Elasticsearch indices covering millions of objects. The problem occurs with only one of them, which contains around 5 million objects.
When I try to rebuild that Elasticsearch index (I am using django-elasticsearch-dsl), the following happens:
- The document count in the index remains 0.
- Disk usage on my PostgreSQL server grows rapidly. Specifically, the Temp folder keeps growing.
- If I do nothing, the rebuild stops when the disk is 100% full.
- If I kill the index rebuild command manually, disk usage keeps growing until I restart PostgreSQL.
- ps aux shows that the DECLARE CURSOR statement is still running (i.e. it never completes) right up until the crash.
- In the end, the Elasticsearch index exists but contains 0 documents.
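For anyone who wants to check the same symptoms from the database side rather than with ps aux, the observations above can be sketched roughly as follows. This is a minimal sketch, assuming a DB-API cursor connected to the affected PostgreSQL database; the names ACTIVE_QUERIES_SQL, TEMP_USAGE_SQL, fetch_all and report are my own and purely illustrative.

```python
# Statements that have been running (or sitting in a transaction) for at
# least the given number of seconds, e.g. a DECLARE CURSOR that never finishes.
ACTIVE_QUERIES_SQL = """
SELECT pid, state, now() - query_start AS running_for, left(query, 120) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND query_start < now() - %s * interval '1 second'
ORDER BY query_start
"""

# Cumulative temporary file usage recorded for the current database.
TEMP_USAGE_SQL = """
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE datname = current_database()
"""


def fetch_all(cursor, sql, params=None):
    """Run one query on a DB-API cursor and return all rows."""
    cursor.execute(sql, params)
    return cursor.fetchall()


def report(cursor, min_seconds=60):
    """Collect long-running statements and temp file usage in one dict."""
    return {
        "active": fetch_all(cursor, ACTIVE_QUERIES_SQL, (min_seconds,)),
        "temp": fetch_all(cursor, TEMP_USAGE_SQL),
    }
```

In a Django shell I would expect to call this with a cursor obtained from django.db.connection.cursor(), once while the rebuild is running and once after killing it, to compare the two results.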
This index existed before and I had no problems with it. As far as I know I have not changed any of its settings, which makes the behaviour even stranger. I have no disk space problems and no trouble rebuilding another index that is 20 times larger.
The settings of the problem index are the following:
- number_of_shards: 1
- number_of_replicas: 0
- mappings: {"properties": {"keyword": {"type": "text"}}}
So there is only one field, called 'keyword'. I am using the standard analyzer and do no extra processing of the 'keyword' content when building the index.
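Put together, the settings above correspond roughly to the index body below. This is only a sketch written as a Python dict, assuming the typeless mapping format of Elasticsearch 7 and later; INDEX_BODY is an illustrative name, and in my setup the real body is generated by django-elasticsearch-dsl rather than written by hand.

```python
import json

# Index body matching the settings listed above. No analyzer is set on the
# field, so Elasticsearch applies its default (standard) analyzer.
INDEX_BODY = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {"properties": {"keyword": {"type": "text"}}},
}

print(json.dumps(INDEX_BODY, indent=2))
```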
I have tried renaming the index; blocking all signals from Elasticsearch to PostgreSQL; disabling auto_refresh on Elasticsearch; deleting, building, and filling the index instead of just rebuilding it; restarting the whole website; etc.
What might be causing this behaviour and how can I deal with it?