At least one primary shard for the index [.security-7] is unavailable issue

sahere37 · October 9, 2022, 4:49am

Hi
I have a two-node cluster with IPs "0.0.0.1" and "0.0.0.2". One of my VMs, "0.0.0.2", suddenly stopped, and when I started it again the cluster health was RED. I then restarted both VMs, after which the message below appeared in their logs and I could no longer log in at https://0.0.0.1:9200 or https://0.0.0.2:9200.

[2022-10-08T08:40:07,003][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [node-1] collector [cluster_stats] failed to collect data
org.elasticsearch.action.UnavailableShardsException: at least one primary shard for the index [.security-7] is unavailable
	at org.elasticsearch.xpack.security.support.SecurityIndexManager.getUnavailableReason(SecurityIndexManager.java:147) ~[?:?]
	at org.elasticsearch.xpack.security.authc.esnative.NativeUsersStore.getUserCount(NativeUsersStore.java:167) ~[?:?]
	at org.elasticsearch.xpack.security.authc.esnative.NativeRealm.lambda$usageStats$1(NativeRealm.java:56) ~[?:?]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.lambda$usageStats$5(CachingUsernamePasswordRealm.java:249) ~[?:?]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.xpack.core.security.authc.Realm.usageStats(Realm.java:140) ~[?:?]
	at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.usageStats(CachingUsernamePasswordRealm.java:247) ~[?:?]
	at org.elasticsearch.xpack.security.authc.esnative.NativeRealm.usageStats(NativeRealm.java:56) ~[?:?]
	at org.elasticsearch.xpack.security.authc.Realms.usageStats(Realms.java:388) ~[?:?]
	at org.elasticsearch.xpack.security.SecurityFeatureSet.usage(SecurityFeatureSet.java:165) ~[?:?]
	at org.elasticsearch.xpack.core.action.TransportXPackUsageAction.lambda$masterOperation$2(TransportXPackUsageAction.java:86) ~[?:?]
	at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:135) ~[?:?]
	at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.16.1.jar:7.16.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]

Then, based on this link, I took the steps below to resolve the issue:

1- Define a new user:
elasticsearch-users useradd restore_user -p xxxxxxx -r superuser
2- Delete the corrupt index:

curl -u restore_user -k -X DELETE "https://localhost:9200/.security-*"

3- Restart all nodes

After these steps I was able to log in, as the new user, to the Elasticsearch node on which I had defined that user. But all previous roles and users had vanished, and I had to define them manually again.
How can I handle this issue without having to define the users and roles again?
Also, the cluster health is RED and Kibana monitoring shows two unassigned shards, but in the indices view the status of every index is green.

Regards

warkolm · October 10, 2022, 11:44pm

When you delete the index you delete all the existing users and roles.

We would need to figure out why your nodes "suddenly stopped" to try to understand what caused the index to be unrecoverable. Sharing some more logs would help.
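
For anyone landing here with the same RED status: before deleting anything, it may help to ask Elasticsearch why the shards are unassigned. The sketch below is not from the thread. It assumes the `_cat/shards` API is queried with JSON output and the `unassigned.reason` column, and that each unassigned shard is then passed to the cluster allocation explain API. The sample rows and function names are made up for illustration, and the actual HTTP calls (with authentication) are left out.

```python
import json

# Path that asks _cat/shards for JSON rows including the unassigned reason.
CAT_SHARDS_PATH = "/_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason"


def unassigned_shards(cat_shards_rows):
    """Keep only the rows of a _cat/shards JSON response whose state is UNASSIGNED."""
    return [row for row in cat_shards_rows if row.get("state") == "UNASSIGNED"]


def allocation_explain_body(row):
    """Build the request body for GET /_cluster/allocation/explain from one row."""
    return {
        "index": row["index"],
        "shard": int(row["shard"]),       # _cat returns shard numbers as strings
        "primary": row["prirep"] == "p",  # "p" = primary, "r" = replica
    }


if __name__ == "__main__":
    # Hypothetical rows, shaped like a _cat/shards JSON response.
    sample = [
        {"index": ".security-7", "shard": "0", "prirep": "p",
         "state": "UNASSIGNED", "unassigned.reason": "CLUSTER_RECOVERED"},
        {"index": "my-index", "shard": "0", "prirep": "p",
         "state": "STARTED", "unassigned.reason": None},
    ]
    for row in unassigned_shards(sample):
        print(json.dumps(allocation_explain_body(row)))
```

The response of the allocation explain call usually states why a primary cannot be assigned, which is the kind of detail that is useful to share alongside the logs.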

sahere37 · October 11, 2022, 5:26am

Hi, thanks for your answer. The VM was stopped, so the Elasticsearch node stopped too.
How can we prevent such a disaster?

warkolm · October 11, 2022, 5:30am

How were the hosts stopped exactly? Did Elasticsearch have the chance to shut down gracefully?

sahere37 · October 11, 2022, 5:41am

Actually, the VM that the software group gave us was only available for a limited time, and after that time it stopped automatically. However, Elasticsearch is installed as a Windows service, so it should have stopped correctly.
What is the best way to stop Elasticsearch to prevent this problem? And when this issue happens, is there any way to resolve the above error other than deleting the security index?

sahere37 · October 11, 2022, 5:46am

Also, is there a way to take a daily backup of roles and users, so that when this issue happens we can just restore the backup instead of defining the users and roles manually?

sahere37 · November 2, 2022, 8:46am

Hi,
I would appreciate it if you could consider my comments. Thanks.

warkolm · November 2, 2022, 10:07pm

You can take snapshots of your indices and then restore them, yes.
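
As a rough illustration of the snapshot approach: users and roles live in the `.security` index, so a snapshot that includes the security feature state should cover them. The sketch below only builds the create-snapshot request. It assumes a snapshot repository is already registered (the name `my_backup_repo` is hypothetical) and that the `feature_states` option is available in the version in use, so please check the snapshot documentation for your release before relying on it.

```python
import json
from datetime import date


def security_snapshot_request(repository, day):
    """Return (method, path, body) for a snapshot limited to the security feature state."""
    snapshot_name = "security-" + day.strftime("%Y.%m.%d")
    path = "/_snapshot/{}/{}?wait_for_completion=true".format(repository, snapshot_name)
    body = {
        "indices": "-*",                 # exclude regular indices (assumption: only security data is wanted)
        "include_global_state": True,
        "feature_states": ["security"],  # assumption: feature states are supported by this version
    }
    return "PUT", path, body


if __name__ == "__main__":
    # "my_backup_repo" is a hypothetical repository name.
    method, path, body = security_snapshot_request("my_backup_repo", date.today())
    print(method, path)
    print(json.dumps(body, indent=2))
```

Running something like this once a day from a scheduler, or using a snapshot lifecycle policy with the same settings, would give the daily backup asked about above.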

sahere37 · November 7, 2022, 12:31pm

I mean taking a backup of the defined roles and users, not of indices. Can we back up roles and users?
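
One possible way to back up role definitions without a snapshot is to save the response of `GET /_security/role` and later replay each custom role with `PUT /_security/role/<name>`. The sketch below is only an assumption about how that could look, not something confirmed in this thread; the function names and sample data are hypothetical. Note that `GET /_security/user` does not return password hashes, so users exported the same way would need new passwords. A snapshot of the security index remains the more complete backup.

```python
from urllib.parse import quote


def custom_roles(get_roles_response):
    """Drop built-in (reserved) roles and the response-only transient_metadata field."""
    result = {}
    for name, definition in get_roles_response.items():
        metadata = definition.get("metadata") or {}
        if metadata.get("_reserved"):
            continue  # built-in roles exist on every cluster and need no backup
        result[name] = {k: v for k, v in definition.items() if k != "transient_metadata"}
    return result


def restore_requests(roles):
    """Turn saved role definitions into (method, path, body) tuples to replay later."""
    return [
        ("PUT", "/_security/role/" + quote(name, safe=""), body)
        for name, body in sorted(roles.items())
    ]


if __name__ == "__main__":
    # Hypothetical data, shaped like a GET /_security/role response.
    sample = {
        "superuser": {"cluster": ["all"], "metadata": {"_reserved": True}},
        "logs_reader": {
            "cluster": [],
            "indices": [{"names": ["logs-*"], "privileges": ["read"]}],
            "metadata": {},
            "transient_metadata": {"enabled": True},
        },
    }
    for request in restore_requests(custom_roles(sample)):
        print(request)
```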

