Index replicas are set to 0 overnight? [7.11.2]

mbooh · September 8, 2021, 8:29am

Hi!

I am seeing some strange behavior. During the night, one of my indices lost its replica setting. Yesterday everything worked fine and both indices had 1 replica, but this morning one of them has 0, which means that when I shut one node down I get the "all shards failed" error. How can that be?
I set the replicas with a POST command. Can it be specified in the .yaml instead?

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   klaranatetwebutv2_2 QZvcjC9JQhehCiTT7BHxEQ   1   1      48519        17690    526.3mb        263.1mb
green  open   klaranatetwebutv2_1 jKF9YttySPuQ6H8EkiDzpw   1   0      48519          664    163.7mb        163.7mb
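
The `rep` column in the listing above is the replica count per index. As a minimal sketch, assuming the text has the same column layout as the listing (it looks like `_cat/indices?v` output) and that every column is populated, the indices left without a replica could be picked out like this. The function name is illustrative only.

```python
# Sketch: flag indices whose replica count ("rep" column) is 0.
# The sample text is copied from the listing above.
CAT_OUTPUT = """\
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   klaranatetwebutv2_2 QZvcjC9JQhehCiTT7BHxEQ   1   1      48519        17690    526.3mb        263.1mb
green  open   klaranatetwebutv2_1 jKF9YttySPuQ6H8EkiDzpw   1   0      48519          664    163.7mb        163.7mb
"""


def indices_without_replicas(cat_text):
    """Return the names of indices whose 'rep' column is 0."""
    lines = [line for line in cat_text.splitlines() if line.strip()]
    header = lines[0].split()
    index_col = header.index("index")
    rep_col = header.index("rep")
    result = []
    for line in lines[1:]:
        # Whitespace splitting assumes no column is empty on this row.
        fields = line.split()
        if int(fields[rep_col]) == 0:
            result.append(fields[index_col])
    return result


print(indices_without_replicas(CAT_OUTPUT))
```

An index with 0 replicas has only its primary shard, so stopping the node that holds that shard leaves nothing to search.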

And why are there _1 and _2? I don't really understand that either.

[2021-09-08T10:19:28,603][WARN ][r.suppressed             ] [STHLM-KLARA-04] path: /klaranatetwebutv2_1/_search, params: {typed_keys=true, index=klaranatetwebutv2_1}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:601) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:332) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:636) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:415) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:240) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:308) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.2.jar:7.11.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.elasticsearch.action.NoShardAvailableActionException
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:448) ~[elasticsearch-7.11.2.jar:7.11.2]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:397) [elasticsearch-7.11.2.jar:7.11.2]

Thanks for your help!

/Kristoffer

warkolm · September 8, 2021, 9:46am

Do the logs on the master mention anything about the index during this time?

mbooh · September 8, 2021, 11:39am

I found this in the current master node's log from last night:

[2021-09-08T02:24:00,343][INFO ][o.e.x.m.MlDailyMaintenanceService] [STHLM-KLARA-04] triggering scheduled [ML] maintenance tasks
[2021-09-08T02:24:00,358][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [STHLM-KLARA-04] Deleting expired data
[2021-09-08T02:24:00,405][INFO ][o.e.x.m.j.r.UnusedStatsRemover] [STHLM-KLARA-04] Successfully deleted [0] unused stats documents
[2021-09-08T02:24:00,421][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [STHLM-KLARA-04] Completed deletion of expired ML data
[2021-09-08T02:24:00,421][INFO ][o.e.x.m.MlDailyMaintenanceService] [STHLM-KLARA-04] Successfully completed [ML] maintenance task: triggerDeleteExpiredDataTask
[2021-09-08T03:30:00,392][INFO ][o.e.x.s.SnapshotRetentionTask] [STHLM-KLARA-04] starting SLM retention snapshot cleanup task
[2021-09-08T03:30:00,392][INFO ][o.e.x.s.SnapshotRetentionTask] [STHLM-KLARA-04] there are no repositories to fetch, SLM retention snapshot cleanup task complete
[2021-09-08T07:30:00,557][INFO ][o.e.c.m.MetadataDeleteIndexService] [STHLM-KLARA-04] [klaranatetwebutv2_1/ZPmtXs0rToKdwe_Pz5639Q] deleting index
[2021-09-08T07:30:01,255][INFO ][o.e.c.m.MetadataCreateIndexService] [STHLM-KLARA-04] [klaranatetwebutv2_1] creating index, cause [api], templates [], shards [1]/[0]
[2021-09-08T07:30:01,446][INFO ][o.e.c.r.a.AllocationService] [STHLM-KLARA-04] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[klaranatetwebutv2_1][0]]]).
[2021-09-08T07:30:01,683][INFO ][o.e.c.m.MetadataMappingService] [STHLM-KLARA-04] [klaranatetwebutv2_1/jKF9YttySPuQ6H8EkiDzpw] create_mapping [_doc]

It looks like the index is deleted and recreated. Could that be the cause, and is it being created without replicas?

/Kristoffer
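
In the `creating index` line of the log above, the trailing `shards [1]/[0]` gives the number of primary shards and the number of replicas, so the recreated index starts with zero replicas. `cause [api]` indicates the index was created through an API request, and `templates []` that no index template was applied. A minimal sketch for pulling those fields out of such a line follows. The regular expression is written against the 7.11.2 line shown above and is an assumption for other versions.

```python
import re

# Pattern for the "creating index" line logged by MetadataCreateIndexService,
# based on the 7.11.2 log excerpt above.
CREATE_RE = re.compile(
    r"\[(?P<index>[^\]]+)\] creating index, cause \[(?P<cause>[^\]]*)\], "
    r"templates \[(?P<templates>[^\]]*)\], "
    r"shards \[(?P<shards>\d+)\]/\[(?P<replicas>\d+)\]"
)

LOG_LINE = (
    "[2021-09-08T07:30:01,255][INFO ][o.e.c.m.MetadataCreateIndexService] "
    "[STHLM-KLARA-04] [klaranatetwebutv2_1] creating index, cause [api], "
    "templates [], shards [1]/[0]"
)


def parse_create_line(line):
    """Return index name, cause, primaries and replicas, or None if no match."""
    match = CREATE_RE.search(line)
    if match is None:
        return None
    return {
        "index": match.group("index"),
        "cause": match.group("cause"),
        "primaries": int(match.group("shards")),
        "replicas": int(match.group("replicas")),
    }


info = parse_create_line(LOG_LINE)
print(info)
```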

mbooh · September 8, 2021, 4:34pm

Hi! I found it. The index is deleted and recreated by code that uses Elasticsearch.Net, and when that happens this code runs:

var createIndexResponse = Indices.Create(indexName, createIndex => createIndex
    .Settings(settings => settings
        .NumberOfShards(1)
        .NumberOfReplicas(0)
        .Analysis(analysis => analysisSettings))
    .Timeout(TimeSpan.FromSeconds(60)));

No wonder the index was created without a replica!

/Kristoffer
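
Since the creation code above passes `NumberOfReplicas(0)`, any replica count set afterwards is lost each time the index is recreated. Changing that value in the creation call is the durable fix. On the earlier .yaml question: in Elasticsearch 5.0 and later, index-level settings such as `number_of_replicas` are not read from `elasticsearch.yml`. They are set per index at creation time, through an index template, or through the update index settings API. The following is a minimal sketch of that last option. It only builds the `PUT /<index>/_settings` request, and the cluster address (`http://localhost:9200`, no authentication) is an assumption.

```python
import json
import urllib.request

# Assumption: the cluster is reachable here without authentication.
ES_URL = "http://localhost:9200"


def build_replica_update(index_name, replicas):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/{index_name}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_replica_update("klaranatetwebutv2_1", 1)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To apply it against a live cluster: urllib.request.urlopen(request)
```

The request would have to be re-sent after every recreation of the index, which is why correcting the creation settings is the simpler route.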
