Our scenario is described below.
We run a self-managed, 3-node Elasticsearch cluster on AWS EC2. How can we recover from the data loss described below? Can your team of experts help?
- We confirmed the loss of all user data by checking Elasticsearch and seeing that the User index is empty.
- A Lambda job runs every hour to back up all database tables to S3. It has not operated properly since September 2019: every table except the user table has its data backed up.
- It remains unclear why this is the case. Neither the job nor the User index in Elasticsearch has changed, so something else must have changed.
- Fixing the backup job should be priority #1 once the User table is back to normal.
- We checked the lambda job's logs to find the period when the user table became empty: sometime between 6:38 and 7:38 UTC (log lines below)
- We checked the elasticsearch logs to see if there was any visible cause for the data loss. There was none.
- We did find that the index was recreated at 10:50 UTC. Since it is a new index, the size of the index is 0 bytes, as expected. (log lines below)
- There is no smoking gun for why the User index was deleted
Lambda Job Logs (evidence of loss)
- 6:38UTC - 802751f5-46bd-4b17-b56a-f20d98ac79e1 Snapshotting users_2018-03-23 - count: 1112408, size: 575540262
- 7:38UTC - 7ac10956-18e8-4f67-83ee-e6fa9102a112 Snapshotting users_2018-03-23 - count: 0, size: 1590
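To narrow the loss window across a longer run of these log lines, something like the following Python sketch could help. It assumes every snapshot line has the same shape as the two excerpts above; the regular expression and the function names (parse_snapshot_lines, find_empty_transitions) are hypothetical helpers, not part of the Lambda job itself.

```python
import re

# Pattern inferred from the two log excerpts above (an assumption, not a documented format).
LINE_RE = re.compile(
    r"(?P<time>\d{1,2}:\d{2})\s*UTC\s*-\s*(?P<request_id>[0-9a-f-]{36})\s+"
    r"Snapshotting\s+(?P<index>\S+)\s+-\s+count:\s*(?P<count>\d+),\s*size:\s*(?P<size>\d+)"
)

# The two lines quoted in this post, used as sample input.
SAMPLE_LINES = [
    "6:38UTC - 802751f5-46bd-4b17-b56a-f20d98ac79e1 Snapshotting users_2018-03-23 - count: 1112408, size: 575540262",
    "7:38UTC - 7ac10956-18e8-4f67-83ee-e6fa9102a112 Snapshotting users_2018-03-23 - count: 0, size: 1590",
]


def parse_snapshot_lines(lines):
    """Extract time, request id, index name, count and size from matching lines."""
    records = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not snapshot records
        records.append({
            "time": match.group("time"),
            "request_id": match.group("request_id"),
            "index": match.group("index"),
            "count": int(match.group("count")),
            "size": int(match.group("size")),
        })
    return records


def find_empty_transitions(records):
    """Return (before, after) pairs where an index went from a non-zero count to zero."""
    last_seen = {}
    transitions = []
    for record in records:
        previous = last_seen.get(record["index"])
        if previous is not None and previous["count"] > 0 and record["count"] == 0:
            transitions.append((previous, record))
        last_seen[record["index"]] = record
    return transitions


if __name__ == "__main__":
    for before, after in find_empty_transitions(parse_snapshot_lines(SAMPLE_LINES)):
        print(f"{after['index']} emptied between {before['time']} and {after['time']} UTC")
```

The same check could also run inside the backup job as a guard, so that a sudden drop to zero raises an alert instead of being silently recorded.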
Elasticsearch Logs (Index Recreated)
- [2020-03-25 10:50:06,595][INFO ][cluster.metadata ] [ip-10-2-3-179] [users_2018-03-23] creating index, cause [api], templates , shards /, mappings [role, user, token]
- [2020-03-25 10:50:07,248][INFO ][cluster.routing.allocation] [ip-10-2-3-179] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[users_2018-03-23], [users_2018-03-23], [users_2018-03-23], [users_2018-03-23], [users_2018-03-23]] ...]).
- [2020-03-25 10:50:08,239][INFO ][cluster.routing.allocation] [ip-10-2-3-179] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[users_2018-03-23], [users_2018-03-23], [users_2018-03-23]] ...]).
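Since the deletion itself has not been found in the logs, it may be worth scanning the logs of all three nodes for every lifecycle event on this index. The Python sketch below is one way to do that, assuming the log lines follow the layout shown above. The function name index_events is hypothetical, and the "deleting index" message text is an assumption about how this Elasticsearch version words a deletion; only "creating index" is confirmed by the excerpts here.

```python
import re
from datetime import datetime

# Layout inferred from the log excerpts above: [timestamp][LEVEL][component] [node] message
ES_LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<component>[^\]]+?)\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*"
    r"(?P<message>.*)"
)

# "creating index" appears in the logs above; "deleting index" is an assumed wording.
ACTIONS = ("creating index", "deleting index")


def index_events(lines, index_name):
    """Return (timestamp, action) tuples for create/delete messages about index_name."""
    events = []
    prefix = f"[{index_name}] "
    for line in lines:
        match = ES_LINE_RE.search(line)
        if match is None:
            continue  # not a log line in the expected layout
        message = match.group("message")
        if not message.startswith(prefix):
            continue  # message is about something other than this index
        rest = message[len(prefix):]
        for action in ACTIONS:
            if rest.startswith(action):
                ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
                events.append((ts, action))
    return events


if __name__ == "__main__":
    sample = [
        "[2020-03-25 10:50:06,595][INFO ][cluster.metadata ] [ip-10-2-3-179] "
        "[users_2018-03-23] creating index, cause [api]",
    ]
    for ts, action in index_events(sample, "users_2018-03-23"):
        print(ts.isoformat(), action)
```

If no deletion message turns up on any node between 6:38 and 7:38 UTC, that would suggest the data was removed some other way, or that the relevant log level was not enabled.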