I have a question/concern about how Elasticsearch works in relation to the GDPR. Under the GDPR you have the right to be forgotten. Let's say I am a server owner and I hold some personal information, which its owner has asked me to remove. When I mark the data for removal, it isn't immediately removed from disk; it is only filtered out at search time and physically removed later. Can this be a problem for GDPR compliance?
Technically, if you mark something as removed and then turn off the cluster for a year, you could still read the data that was supposed to be removed from the disk. Where can I find specific information about this topic?
I'm not sure I've understood this correctly. Could anyone take a look at the question?
The following is not legal advice; I am not a lawyer and certainly not your lawyer, so please seek proper help before taking any action based on this response.
The question of "soft" data deletion and its interaction with the GDPR is not unique to Elasticsearch. Even if you delete files from disk, you may still be able to recover their contents at a later date. It is also usually impractical to remove all traces of a data subject from historical backups. The GDPR doesn't really define what it means to "erase" data, and there is a school of thought that it's sufficient simply to have a policy that forbids recovery techniques like the one you describe.
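To make the mechanism concrete, here is a toy Python model of mark-then-merge deletion. It is only a sketch: `Segment`, `merge`, and `raw_bytes` are hypothetical names invented for illustration, not Elasticsearch or Lucene APIs, and the real on-disk format is far more involved.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an immutable on-disk segment (hypothetical)."""

    docs: dict[str, str]  # doc id -> stored content, written once
    deleted: set[str] = field(default_factory=set)  # ids marked as deleted

    def delete(self, doc_id: str) -> None:
        # A delete only records a marker; the stored content is untouched.
        if doc_id in self.docs:
            self.deleted.add(doc_id)

    def search(self, text: str) -> list[str]:
        # Marked documents are filtered out when searching.
        return [
            doc_id
            for doc_id, body in self.docs.items()
            if doc_id not in self.deleted and text in body
        ]

    def raw_bytes(self) -> str:
        # What someone reading the segment file directly would see.
        return "\n".join(self.docs.values())


def merge(segments: list[Segment]) -> Segment:
    # A merge writes a new segment that contains only the live documents.
    live = {
        doc_id: body
        for seg in segments
        for doc_id, body in seg.docs.items()
        if doc_id not in seg.deleted
    }
    return Segment(docs=live)


if __name__ == "__main__":
    seg = Segment(docs={"1": "alice@example.com", "2": "bob@example.com"})
    seg.delete("1")
    print(seg.search("alice"))               # [] : hidden from search
    print("alice" in seg.raw_bytes())        # True : still in the old segment
    print("alice" in merge([seg]).raw_bytes())  # False : gone after the merge
```

In a real cluster, the force merge API with its `only_expunge_deletes` option asks Elasticsearch to merge away segments that contain deleted documents, but it does not guarantee that every deleted document is purged, and blocks freed on disk may still be recoverable at the filesystem level, which is why the policy question above still matters.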
Thank you for your extensive answer. I'll do the additional research to ensure we comply with the law.