Index becomes read-only for an unknown reason

I am using ILM via Logstash to manage my indices.
I have a new index that was created on 31/10, and for some unknown reason it has become read-only; I noticed it has index.blocks.write: true set on it.

My ILM policy shouldn't have done this, as none of its conditions were met. The policy:

{
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "rollover": {
                        "max_size": "100gb"
                    },
                    "set_priority": {
                        "priority": 100
                    }
                }
            },
            "warm": {
                "min_age": "0ms",
                "actions": {
                    "allocate": {
                        "number_of_replicas": 1,
                        "include": {},
                        "exclude": {},
                        "require": {
                            "box_type": "warm"
                        }
                    },
                    "forcemerge": {
                        "max_num_segments": 1
                    },
                    "set_priority": {
                        "priority": 50
                    },
                    "shrink": {
                        "number_of_shards": 1
                    }
                }
            },
            "cold": {
                "min_age": "30d",
                "actions": {
                    "freeze": {},
                    "set_priority": {
                        "priority": 0
                    }
                }
            },
            "delete": {
                "min_age": "365d",
                "actions": {
                    "delete": {}
                }
            }
        }
    }
}

I have DEBUG logs from the Master at the time the index became read-only:

[2019-11-04T12:30:04,705][DEBUG][o.e.i.IndicesService ] [Master] creating Index [[12months-retention-2019.10.31-000001/IQvq2eTrSqyX-Gd476zvLA]], shards [5]/[1] - reason [metadata verification]
[2019-11-04T12:30:04,706][DEBUG][o.e.i.m.MapperService ] [Master] [12months-retention-2019.10.31-000001] using dynamic[true]
[2019-11-04T12:30:04,707][DEBUG][o.e.i.c.b.BitsetFilterCache] [Master] [12months-retention-2019.10.31-000001] clearing all bitsets because [close]
[2019-11-04T12:30:04,707][DEBUG][o.e.i.c.q.IndexQueryCache] [Master] [12months-retention-2019.10.31-000001] full cache clear, reason [close]
[2019-11-04T12:30:04,707][DEBUG][o.e.i.c.b.BitsetFilterCache] [Master] [12months-retention-2019.10.31-000001] clearing all bitsets because [close]
[2019-11-04T12:31:48,285][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [12months-retention-2019.10.31-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]

This is the index template:

{
  "order": 0,
  "version": 60001,
  "index_patterns": [
    "12months-retention-*"
  ],
  "settings": {
    "index": {
      "lifecycle": {
        "name": "12months_policy",
        "rollover_alias": "12months-retention"
      }
    }
  },
  "mappings": {},
  "aliases": {}
}

This is the current index status:

No new index was created; the index did not roll over.
The Elasticsearch version is 6.8.3

Please advise, thanks.

The read-only block is only applied just prior to the forcemerge or shrink actions. It appears, though, that this index is still in the rollover stage.
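In the meantime, if you need to write to the index again, manually clearing the setting should work (substitute your index name):

PUT /<index-name>/_settings
{
  "index.blocks.write": null
}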

Can you provide the output of:

GET /<index-name>/_ilm/explain?human

So I can see where the index currently is?

Hi,

{
  "indices" : {
    "12months-retention-2019.10.31-000001" : {
      "index" : "12months-retention-2019.10.31-000001",
      "managed" : true,
      "policy" : "12months_policy",
      "lifecycle_date" : "2019-10-31T10:45:01.688Z",
      "lifecycle_date_millis" : 1572518701688,
      "phase" : "hot",
      "phase_time" : "2019-11-10T08:46:32.139Z",
      "phase_time_millis" : 1573375592139,
      "action" : "rollover",
      "action_time" : "2019-10-31T10:50:15.203Z",
      "action_time_millis" : 1572519015203,
      "step" : "check-rollover-ready",
      "step_time" : "2019-11-10T08:46:32.139Z",
      "step_time_millis" : 1573375592139,
      "phase_execution" : {
        "policy" : "12months_policy",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_size" : "100gb"
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "version" : 2,
        "modified_date" : "2019-10-07T13:55:05.451Z",
        "modified_date_in_millis" : 1570456505451
      }
    }
  }
}

Thanks,

Okay, judging from that, the index should not have been marked as read-only. Do you know if any other process could have done this? It appears that ILM is not in a position to do it.

Also, does this problem reproduce on any of your other indices? And can you confirm this is the write block and not the write-but-allow-deletes block?
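To check, the index settings show which block is set:

GET /<index-name>/_settings?filter_path=*.settings.index.blocks

index.blocks.write: true is the full write block (the one ILM sets before shrink and forcemerge), whereas index.blocks.read_only_allow_delete: true is the block Elasticsearch applies on its own when the flood-stage disk watermark is exceeded.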

Hi,

I removed that index and started over with a new index under a completely different policy and alias.

The index became read-only again:

This is a different cluster from the one in the initial post.
I have the exact same configuration on other clusters, all on the same version, but the issue does not occur there.

The ILM policy:

 {
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "rollover": {
                        "max_size": "100gb"
                    },
                    "set_priority": {
                        "priority": 100
                    }
                }
            },
            "warm": {
                "min_age": "0ms",
                "actions": {
                    "allocate": {
                        "number_of_replicas": 1,
                        "include": {},
                        "exclude": {},
                        "require": {
                            "box_type": "warm"
                        }
                    },
                    "forcemerge": {
                        "max_num_segments": 1
                    },
                    "set_priority": {
                        "priority": 50
                    },
                    "shrink": {
                        "number_of_shards": 1
                    }
                }
            },
            "cold": {
                "min_age": "30d",
                "actions": {
                    "freeze": {},
                    "set_priority": {
                        "priority": 0
                    }
                }
            },
            "delete": {
                "min_age": "365d",
                "actions": {
                    "delete": {}
                }
            }
        }
    }
}

BTW, ILM is controlled by Logstash using the following configuration:

    elasticsearch {
        hosts => ['xxx1', 'xxx2']
        sniffing => true
        ilm_enabled => true
        ilm_policy => "ltr12_policy"
        ilm_rollover_alias => "ltr12-services"
        manage_template => true
        template_name => "ltr12-services"
        template => "/etc/logstash/index_templates/ltr12-services.json"
        template_overwrite => true
    }

More findings

It seems the index becomes read-only every day at the same time (12:30 EDT).
Looking at the Master (debug) logs at that time shows these:

[2019-11-15T11:45:12,649][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T11:55:12,652][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:05:12,653][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:15:12,659][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:25:12,653][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:30:04,357][DEBUG][o.e.i.IndicesService ] [Master] creating Index [[ltr12-services-2019.11.10-000001/gipLHlPzSeedZBWyiQRQNw]], shards [2]/[1] - reason [metadata verification]
[2019-11-15T12:30:04,358][DEBUG][o.e.i.m.MapperService ] [Master] [ltr12-services-2019.11.10-000001] using dynamic[true]
[2019-11-15T12:30:04,359][DEBUG][o.e.i.c.b.BitsetFilterCache] [Master] [ltr12-services-2019.11.10-000001] clearing all bitsets because [close]
[2019-11-15T12:30:04,359][DEBUG][o.e.i.c.q.IndexQueryCache] [Master] [ltr12-services-2019.11.10-000001] full cache clear, reason [close]
[2019-11-15T12:30:04,359][DEBUG][o.e.i.c.b.BitsetFilterCache] [Master] [ltr12-services-2019.11.10-000001] clearing all bitsets because [close]
[2019-11-15T12:35:12,659][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:45:12,660][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T12:55:12,654][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]
[2019-11-15T13:05:12,652][DEBUG][o.e.x.i.IndexLifecycleRunner] [Master] [ltr12-services-2019.11.10-000001] running periodic policy with current-step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]

Can you help explain this behavior?

Finally found the root cause.
It was a periodic Curator job that sets index.blocks.write and runs a forceMerge on any index older than X days.
Curator is supposed to skip ILM-managed indices by default, which is why I didn't check it before.
It turns out, though, that this has only been supported since Curator version 5.7.

I solved this by upgrading Curator to 5.7.
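For anyone who hits the same thing: since Curator 5.7, actions have an allow_ilm_indices option that defaults to false, so ILM-managed indices are filtered out automatically. A sketch of the kind of action file that was causing this (the age threshold and descriptions are illustrative, not my real config):

actions:
  1:
    action: index_settings
    description: Mark indices older than X days read-only
    options:
      index_settings:
        index:
          blocks:
            write: true
      # On Curator >= 5.7 this defaults to false, which skips ILM-managed indices
      allow_ilm_indices: false
    filters:
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 7   # placeholder for the real "X days"
  2:
    action: forcemerge
    description: Force-merge the read-only indices down to one segment
    options:
      max_num_segments: 1
      allow_ilm_indices: false
    filters:
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 7   # placeholder for the real "X days"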