Some ILM bugs between Logstash and Elasticsearch, and some proposals for the overall ILM setup process

Hi all,

I encountered two behaviors that I think are bugs (ES 7.6.0).
In this message, after describing those two situations, I put a few proposals on the table to ease ILM usage, and I hope to start a debate on them.

Bug #1

It is possible to create an index template that refers to a nonexistent lifecycle policy. It is also possible to create a matching index without any error message. Nevertheless, the policy is not applied...
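
Until such a check exists on the server side, it can be approximated on the client side. Here is a minimal sketch, assuming the template and policy definitions have already been fetched (for example from GET _template and GET _ilm/policy) and loaded into plain dicts. The function names are hypothetical and nothing here calls Elasticsearch.

```python
from typing import Dict, Optional, Set


def lifecycle_name(template: dict) -> Optional[str]:
    """Return the ILM policy name a template refers to, or None."""
    settings = template.get("settings", {})
    index = settings.get("index", {})
    # Accept the nested form, the dotted form under "index" (as written in
    # this post) and the fully dotted form.
    return (
        index.get("lifecycle", {}).get("name")
        or index.get("lifecycle.name")
        or settings.get("index.lifecycle.name")
    )


def dangling_policy_refs(templates: Dict[str, dict], policies: Set[str]) -> Dict[str, str]:
    """Map each template name to the policy it references when that policy is missing."""
    missing = {}
    for name, template in templates.items():
        policy = lifecycle_name(template)
        if policy is not None and policy not in policies:
            missing[name] = policy
    return missing


# Same template settings as in this post, reduced to the relevant keys.
templates = {
    "journeaux": {
        "index_patterns": ["journeaux*"],
        "settings": {"index": {"lifecycle.name": "logs_policy"}},
    }
}
print(dangling_policy_refs(templates, policies=set()))
print(dangling_policy_refs(templates, policies={"logs_policy"}))
```

The first call reports the template because no policy exists yet; the second reports nothing once logs_policy is known.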

Bug #2

Setting things up as follows generates an error on the Elasticsearch side once indices.lifecycle.poll_interval elapses.
For instance, with indices.lifecycle.poll_interval set to 58s in elasticsearch.yml:

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB", 
            "max_age": "60s"    
          }
        }
      },
      "delete": {
        "min_age": "300s",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

PUT _template/journeaux
{
  "index_patterns": ["journeaux*"],
  "aliases": {
    "actuels": { "is_write_index": true }
  }, 
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "host" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "lifecycle.name": "logs_policy",
        "lifecycle.rollover_alias": "actuels"
      }
    }
}

PUT journeaux-000001
{
  "aliases": {
    "actuels": {
      "is_write_index": true 
    }
  } 
} 

or, with the alias declared without is_write_index:

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB", 
            "max_age": "60s"    
          }
        }
      },
      "delete": {
        "min_age": "300s",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

PUT _template/journeaux
{
  "index_patterns": ["journeaux*"],
  "aliases": {
    "actuels": {}
  }, 
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "host" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "lifecycle.name": "logs_policy",
        "lifecycle.rollover_alias": "actuels"
      }
    }
}

PUT journeaux-000001
{
  "aliases": {
    "actuels": { }
  } 
}

Both sequences generate the following logs in Elasticsearch:

[2020-03-11T14:20:19,017][ERROR][o.e.x.i.IndexLifecycleRunner] [Astrolab] policy [logs_policy] for index [journeaux-000001] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step
java.lang.IllegalArgumentException: Rollover alias [actuels] can point to multiple indices, found duplicated alias [[actuels]] in index template [journeaux]
	at org.elasticsearch.action.admin.indices.rollover.TransportRolloverAction.checkNoDuplicatedAliasInIndexTemplate(TransportRolloverAction.java:307) ~[elasticsearch-7.6.0.jar:7.6.0]
	at org.elasticsearch.action.admin.indices.rollover.TransportRolloverAction.masterOperation(TransportRolloverAction.java:134) ~[elasticsearch-7.6.0.jar:7.6.0]
	at org.elasticsearch.action.admin.indices.rollover.TransportRolloverAction.masterOperation(TransportRolloverAction.java:72) ~[elasticsearch-7.6.0.jar:7.6.0]
	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.lambda$doStart$3(TransportMasterNodeAction.java:170) ~[elasticsearch-7.6.0.jar:7.6.0]
	at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73) ~[elasticsearch-7.6.0.jar:7.6.0]
	[...]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
        [...]

The following sequence, in which the template declares no alias, works fine:

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB", 
            "max_age": "60s"    
          }
        }
      },
      "delete": {
        "min_age": "300s",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

PUT _template/journeaux
{
  "index_patterns": ["journeaux*"],
  "aliases": {}, 
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "host" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "lifecycle.name": "logs_policy",
        "lifecycle.rollover_alias": "actuels"
      }
    }
}

PUT journeaux-000001
{
  "aliases": {
    "actuels": {}
  } 
}
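
As far as I can tell from the exception message and the stack trace, rollover is rejected when an index template matching the next index also declares the rollover alias, because the new index would inherit that alias from the template. The sketch below only mirrors that condition in plain Python as I understand it; it is not the actual Elasticsearch code. The function name is hypothetical, and fnmatchcase is only an approximation of index pattern matching.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def alias_conflicts(templates: Dict[str, dict], rollover_alias: str, new_index: str) -> List[str]:
    """Names of templates that match new_index and also declare rollover_alias."""
    conflicts = []
    for name, template in templates.items():
        patterns = template.get("index_patterns", [])
        # Index patterns only use '*', which fnmatchcase handles.
        matches = any(fnmatchcase(new_index, pattern) for pattern in patterns)
        if matches and rollover_alias in template.get("aliases", {}):
            conflicts.append(name)
    return conflicts


# Template from the failing sequences: the alias is declared in the template.
failing = {"journeaux": {"index_patterns": ["journeaux*"], "aliases": {"actuels": {}}}}
# Template from the working sequence: no alias in the template.
working = {"journeaux": {"index_patterns": ["journeaux*"], "aliases": {}}}

print(alias_conflicts(failing, "actuels", "journeaux-000002"))
print(alias_conflicts(working, "actuels", "journeaux-000002"))
```

Under that reading, the alias should be declared only on the first index (journeaux-000001) and not in the template, which is what the working sequence does.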

Evolution proposal #1

To ease ILM usage, it would be nice to modify the index creation sequence.
When a request targets a name for which no index exists yet, Elasticsearch could check whether any index template uses that name as its 'lifecycle.rollover_alias' value.
If an index template references this name as the value of its 'lifecycle.rollover_alias' setting:

1- get the index_pattern value of the index template (Ex: 'logstash-*')
2- replace the wildcard (*) character of the pattern with a sequence beginning at 000001 (Ex: 'logstash-000001')
3- create the index with this name
4- set an alias on it with the 'lifecycle.rollover_alias' value

If the name is not referenced as the 'lifecycle.rollover_alias' value of any index template, then fall back to the current way of creating an index using index patterns.
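
The four steps and the fallback can be sketched as follows. This is only an illustration of the proposal, not existing Elasticsearch behavior: the function names are hypothetical, templates are plain dicts, only the first index pattern is considered, and the returned dict merely describes the index that would be created.

```python
from typing import Dict, Optional


def rollover_alias_of(template: dict) -> Optional[str]:
    """Return the 'lifecycle.rollover_alias' value of a template, or None."""
    settings = template.get("settings", {})
    index = settings.get("index", {})
    return (
        index.get("lifecycle", {}).get("rollover_alias")
        or index.get("lifecycle.rollover_alias")
        or settings.get("index.lifecycle.rollover_alias")
    )


def bootstrap_request(name: str, templates: Dict[str, dict]) -> Optional[dict]:
    """Describe the first index to create for `name`, or None to fall back."""
    for template in templates.values():
        if rollover_alias_of(template) != name:
            continue
        pattern = template["index_patterns"][0]         # step 1
        index_name = pattern.replace("*", "000001", 1)  # step 2
        return {
            "index": index_name,                        # step 3
            "aliases": {name: {}},                      # step 4
        }
    # No template uses this name as rollover alias: keep the current behavior.
    return None


templates = {
    "logstash": {
        "index_patterns": ["logstash-*"],
        "settings": {"index": {"lifecycle.rollover_alias": "logstash"}},
    }
}
print(bootstrap_request("logstash", templates))
print(bootstrap_request("something-else", templates))
```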

This would greatly ease ILM usage from Logstash and Beats.

For instance, when using the Elasticsearch output of Logstash, we could use an 'alias' parameter instead of the 'index' one.
In this case, the 'alias' parameter would trigger a check for an index template whose 'lifecycle.rollover_alias' value is set to this alias name. If none exists, an error could ask the user to configure the index template and policy on the Elasticsearch cluster side.
This 'alias' parameter would then replace all the ILM parameters (ilm_enabled, ilm_rollover_alias, ilm_pattern and ilm_policy).

For instance, depending on their choice, a user would:

  • use the 'index' parameter only => the user sets the index name the same way as today.
  • use the 'alias' parameter only => the user must then define a policy and an index template with the corresponding 'lifecycle.rollover_alias' value on the Elasticsearch cluster side.
  • use the 4 ILM parameters (all mandatory together, with 'alias' and 'index' forbidden): ilm_enabled, ilm_rollover_alias, ilm_pattern and ilm_policy. This would create the correct policy and the corresponding index template with the correct 'lifecycle.rollover_alias'.
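
The three mutually exclusive choices above could be validated along these lines. This is a rough sketch of the proposal, not how the Logstash output works today: the 'alias' parameter does not exist, the function name is hypothetical, and treating an empty configuration as the 'index' mode (default index name) is my assumption.

```python
ILM_PARAMS = ("ilm_enabled", "ilm_rollover_alias", "ilm_pattern", "ilm_policy")


def output_mode(config: dict) -> str:
    """Return 'index', 'alias' or 'ilm', or raise ValueError on an invalid mix."""
    has_index = "index" in config
    has_alias = "alias" in config
    ilm_given = [param for param in ILM_PARAMS if param in config]

    if ilm_given:
        # The ILM parameters exclude 'index' and 'alias' ...
        if has_index or has_alias:
            raise ValueError("ilm_* parameters cannot be combined with 'index' or 'alias'")
        # ... and are mandatory as a whole.
        if len(ilm_given) != len(ILM_PARAMS):
            missing = sorted(set(ILM_PARAMS) - set(ilm_given))
            raise ValueError(f"missing ILM parameters: {missing}")
        return "ilm"
    if has_index and has_alias:
        raise ValueError("use either 'index' or 'alias', not both")
    if has_alias:
        return "alias"
    # Assumption: no parameter at all means the default index name is used.
    return "index"


print(output_mode({"index": "journeaux-%{+YYYY.MM.dd}"}))
print(output_mode({"alias": "actuels"}))
print(output_mode({param: "..." for param in ILM_PARAMS}))
```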

The corresponding index could then be created when the first message is indexed through the alias.

Evolution proposal #2

Stop offering a setup feature in external tools (like Beats) and encourage users to install dashboards, templates and similar assets once and for all through the API on the Elasticsearch cluster side, using the appropriate packages/zip/tar.gz/JSON files.
Setup functions assume access to at least one cluster node. Behind proxies, in highly secured environments, or when ES sits behind Logstash instances, there may be no open route from Beats to the cluster on the target enterprise network...

Hope this helps,
Looking forward to interesting debates on these proposals :wink:
