ELK 7.8.0 two node cluster "At least one primary shard for the index [.security-7] is unavailable"

Hello,

We are running a two-node ELK cluster and suddenly started seeing the error below. Judging by the error, the .security-7 index seems to be corrupted. Is there any way to restore the old state?

Because of this, Kibana does not load and user authentication fails.

ELK : Version 7.8.0

[2021-01-13T17:05:27,359][INFO ][o.e.x.s.a.AuthenticationService] [elekpelk01] Authentication of [elastic] was terminated by realm [reserved] - failed to authenticate user [elastic]
[2021-01-13T17:05:29,290][ERROR][o.e.x.s.a.e.ReservedRealm] [elekpelk01] failed to retrieve password hash for reserved user [elastic]
org.elasticsearch.action.UnavailableShardsException: at least one primary shard for the index [.security-7] is unavailable
        at org.elasticsearch.xpack.security.support.SecurityIndexManager.getUnavailableReason(SecurityIndexManager.java:181) ~[x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.esnative.NativeUsersStore.getReservedUserInfo(NativeUsersStore.java:525) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.esnative.ReservedRealm.getUserInfo(ReservedRealm.java:224) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.esnative.ReservedRealm.doAuthenticate(ReservedRealm.java:99) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.authenticateWithCache(CachingUsernamePasswordRealm.java:167) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.authenticate(CachingUsernamePasswordRealm.java:104) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$consumeToken$15(AuthenticationService.java:449) [x-pack-security-7.8.0.jar:7.8.0]
        at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:102) [x-pack-core-7.8.0.jar:7.8.0]

The likely cause of this issue is that you removed nodes from the cluster in an unsafe way.
If there are nodes that used to be part of the cluster but have been taken offline, you could try restarting them.

Otherwise you can restore from a snapshot if you have one.

If all else fails you will probably need to delete that index and start again.

However, you really should try and work out how you ended up in this situation, otherwise it's likely to happen again.
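
To work out why the primary is unavailable, the shard state and the allocation explanation can be queried. The following is a minimal sketch, not a definitive procedure: `ES_URL`, `ES_USER`, and the `RUN` switch are assumptions for illustration, and `ES_USER` has to be an account that can still authenticate while .security-7 is down (for example a file-realm superuser). It also assumes the index has a single primary shard, numbered 0.

```shell
#!/usr/bin/env bash
# Sketch: inspect why the .security-7 primary shard is unassigned.
# ES_URL, ES_USER and RUN are assumed names; adjust them to your cluster.
ES_URL="${ES_URL:-https://localhost:9200}"
ES_USER="${ES_USER:-restore_user}"

# Build the _cat/shards URL that lists each shard of an index together
# with its state and the reason it is unassigned.
shards_url() {
  printf '%s/_cat/shards/%s?v&h=index,shard,prirep,state,unassigned.reason,node\n' "$ES_URL" "$1"
}

# Build the URL of the cluster allocation explain API.
explain_url() {
  printf '%s/_cluster/allocation/explain?pretty\n' "$ES_URL"
}

# Only contact the cluster when RUN=1, so the file can be sourced safely.
if [ "${RUN:-0}" = "1" ]; then
  # curl prompts for the password of ES_USER; -k skips certificate checks.
  curl -k -u "$ES_USER" "$(shards_url .security-7)"
  curl -k -u "$ES_USER" -H 'Content-Type: application/json' \
    -X GET "$(explain_url)" \
    -d '{"index": ".security-7", "shard": 0, "primary": true}'
fi
```

Running it with `RUN=1` prints the shard table first and then the explanation for the unassigned primary, which usually names the node that last held the data.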


Thank you for your reply. We tried restarting, which did not help. We then followed your suggestions in the topic "Accidentally deleted .security index for x-pack".

We tried to reset the passwords for the built-in users so that the .security-7 index would be recreated and authentication would work again:

# docker exec -u elasticsearch -it elasticsearch bin/elasticsearch-setup-passwords interactive
Failed to authenticate user 'elastic' against https://10.10.206.110:9200/_security/_authenticate?pretty
Possible causes include:
 * The password for the 'elastic' user has already been changed on this cluster
 * Your elasticsearch node is running against a different keystore
   This tool used the keystore at /usr/share/elasticsearch/config/elasticsearch.keystore
ERROR: Failed to verify bootstrap password

We then tried adding the bootstrap password to the keystore on both nodes:

docker exec -u elasticsearch -it elasticsearch bin/elasticsearch-keystore add bootstrap.password

and restarted Elasticsearch 7.8.0 on both nodes in the cluster:

docker restart elasticsearch

We still get the same error when resetting the built-in user passwords:

# docker exec -u elasticsearch -it elasticsearch bin/elasticsearch-setup-passwords interactive
Failed to authenticate user 'elastic' against https://10.180.206.110:9200/_security/_authenticate?pretty
Possible causes include:
 * The password for the 'elastic' user has already been changed on this cluster
 * Your elasticsearch node is running against a different keystore
   This tool used the keystore at /usr/share/elasticsearch/config/elasticsearch.keystore
ERROR: Failed to verify bootstrap password

Could you please suggest a possible solution?

We fixed it with the steps below. Adding them here in case they help someone.

Create the files below inside the elasticsearch container.

/usr/share/elasticsearch/config/users_roles
/usr/share/elasticsearch/config/users

Run the command below to create a restore user.

docker exec -u elasticsearch -it elasticsearch bin/elasticsearch-users useradd restore_user -p xxxxxxx -r superuser

Delete the corrupt index using the new user, on ONLY one node.

curl -u restore_user -k -X DELETE "https://localhost:9200/.security-*"

Restart Elasticsearch on both nodes.

docker restart elasticsearch
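
The steps above can be strung together as a small script. This is only a sketch under assumptions: the `run` and `recover` helpers and the `DRY_RUN`, `CONTAINER`, and `RESTORE_USER` variables are made up for illustration, the container name and paths are the ones used in this thread, and `touch` is just one way to create the two files. With `DRY_RUN=1` (the default here) the commands are printed instead of executed.

```shell
#!/usr/bin/env bash
# Sketch of the recovery steps from this thread.
# CONTAINER, RESTORE_USER and DRY_RUN are assumed names for illustration.
CONTAINER="${CONTAINER:-elasticsearch}"
RESTORE_USER="${RESTORE_USER:-restore_user}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

recover() {
  # 1. Make sure the file-realm files exist inside the container.
  run docker exec -u elasticsearch "$CONTAINER" touch \
    /usr/share/elasticsearch/config/users \
    /usr/share/elasticsearch/config/users_roles
  # 2. Create a temporary file-realm superuser. Without -p the tool
  #    prompts for the password, which keeps it out of the shell history.
  run docker exec -u elasticsearch -it "$CONTAINER" \
    bin/elasticsearch-users useradd "$RESTORE_USER" -r superuser
  # 3. Delete the broken security index (on one node only).
  #    curl prompts for the password of the restore user.
  run curl -u "$RESTORE_USER" -k -X DELETE "https://localhost:9200/.security-*"
  # 4. Restart Elasticsearch (repeat the restart on the other node).
  run docker restart "$CONTAINER"
}
```

Calling `recover` prints the four commands for review; `DRY_RUN=0 recover` would execute them. Deleting the index also discards any native users and roles stored in it, so the built-in user passwords presumably have to be set again afterwards with elasticsearch-setup-passwords, as attempted earlier in this thread.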

Thanks

