Two questions
- How can the reindex API be used with zero downtime?
- Could index.soft_deletes.enabled be made a static (non-dynamic) setting rather than a final one, so that instead of using the reindex API, one could simply set index.soft_deletes.enabled to true on an existing index which needs soft deletes enabled?
Where the questions come from
- When I read the Elasticsearch documentation, I came across a statement to the effect that soft deletes can only be configured at index creation; if you want to enable them for an existing index, you must reindex your data into a new index with soft deletes enabled.
- I have read some material on Lucene's soft deletes feature, and I have run some tests with Lucene in which hard deletes and soft deletes were used at the same time. It worked well, so I think hard deletes do not conflict with soft deletes in Lucene.
- At the Elasticsearch level, index.soft_deletes.enabled is an index-scoped, final setting (Property.Final). I asked one of the Elasticsearch authors, and he told me that the reason it was made final is to simplify the model.
- So I ran some tests with Elasticsearch v6.8.3. I downloaded the source code and changed the definition of index.soft_deletes.enabled to drop the final property, so that it looks like the following:
    Setting.boolSetting("index.soft_deletes.enabled", false, Property.IndexScope);
- With this change I was able to enable index.soft_deletes.enabled on an existing, closed index. After I reopened the index, CRUD operations and so on still worked well.
- As far as I know, the reindex API copies a snapshot of the existing index taken at the moment it is triggered. If data in the old index is updated after the reindex starts, the old and new indices end up with inconsistent data. This leads to the first question: is there a best practice for using the reindex API with zero downtime?
- If not, why not make index.soft_deletes.enabled a static (non-dynamic) setting, so that reindexing can be avoided in this situation? This is the second question mentioned above.
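To make the inconsistency I describe for the reindex API concrete, here is a toy Java model. It is only a sketch of my understanding: plain maps stand in for the old and new index, and the class and method names (ReindexSnapshotDemo, reindex, staleIds) are made up for illustration. It is not Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.TreeSet;

public class ReindexSnapshotDemo {

    // Toy stand-in for a reindex: copies only the documents present at call time.
    static Map<String, String> reindex(Map<String, String> source) {
        return new TreeMap<>(source);
    }

    // Ids whose content differs between the two maps, or that exist in only one.
    static List<String> staleIds(Map<String, String> oldIndex, Map<String, String> newIndex) {
        TreeSet<String> ids = new TreeSet<>(oldIndex.keySet());
        ids.addAll(newIndex.keySet());
        List<String> stale = new ArrayList<>();
        for (String id : ids) {
            if (!Objects.equals(oldIndex.get(id), newIndex.get(id))) {
                stale.add(id);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, String> oldIndex = new TreeMap<>();
        oldIndex.put("1", "v1");
        oldIndex.put("2", "v1");

        // The copy reflects the old index as it was at this moment.
        Map<String, String> newIndex = reindex(oldIndex);

        // Writes that keep arriving at the old index afterwards.
        oldIndex.put("2", "v2"); // update of an existing document
        oldIndex.put("3", "v1"); // a brand-new document

        System.out.println(staleIds(oldIndex, newIndex)); // prints [2, 3]
    }
}
```

Documents 2 and 3 are missing or outdated in the copy, which is the gap I would like to avoid without stopping writes.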
Could anyone explain the reasoning behind these two questions? Thanks very much.