X-Pack role cannot be created or modified


#1

Hi,

I'm experiencing an issue when creating a new role, which results in the following error message:

{"error":{"root_cause":[{"type":"illegal_state_exception","reason":"role cannot be created or modified as service cannot write until template and mappings are up to date"}],"type":"illegal_state_exception","reason":"role cannot be created or modified as service cannot write until template and mappings are up to date"},"status":500}

The message is quite cryptic, and it's hard to say what is actually going on. I would appreciate any suggestions.

I'm experimenting with X-Pack on an existing cluster (5.1.1) which already has a bunch of indexes.

Request:

POST /_xpack/security/role/my_new_role -d '{
  "cluster": ["monitor"],
  "indices": [
    {
      "names": [ "*" ],
      "privileges": ["monitor", "delete_index"]
    }
  ]
}'

Thanks.


(Tim Vernum) #2

This usually indicates that there is an Elasticsearch version upgrade that hasn't completed yet. When an upgrade is in progress, particularly if it's a rolling upgrade, there are restrictions on updates to security users/roles so that nodes with different versions don't create incompatible changes.

Occasionally that upgrade logic can cause issues.

To diagnose what's going on, can you walk through the following steps:

  • Does your cluster have multiple nodes?
  • If so, please double check that they are all running the exact same version:
    Run GET /_nodes/_all/- and check the "version" field for each node is the same.
  • Check the versions of your security mappings in both the index and template:
    • Template: Run GET /_template/security-index-template and check the _meta.security-version for each type in the mappings section.
    • Index: Run GET /.security/_mappings (you will need to run this as a superuser, such as the builtin elastic user) and check the _meta.security-version for each type. They should all match the version that your cluster is running.
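
For anyone who wants to script these checks, here is a minimal Python sketch. It only compares already-fetched response bodies and does not talk to the cluster. It assumes the 5.x response shapes: GET /_nodes returning {"nodes": {id: {"version": ...}}}, and both the template and index requests returning {name: {"mappings": {type: {"_meta": {"security-version": ...}}}}}. The function names, the sample data and the type name "extra_type" are hypothetical and not part of any Elasticsearch client.

```python
from typing import Dict, Set


def node_versions(nodes_response: dict) -> Set[str]:
    """Collect the distinct "version" values from a GET /_nodes response body."""
    nodes = nodes_response.get("nodes", {})
    return {info.get("version", "unknown") for info in nodes.values()}


def security_versions(mappings_response: dict) -> Dict[str, str]:
    """Map each mapping type to its _meta.security-version.

    Assumed to work for both the template and the index response, since both
    are expected to look like {name: {"mappings": {type: {"_meta": {...}}}}}.
    """
    versions = {}
    for body in mappings_response.values():
        for type_name, mapping in body.get("mappings", {}).items():
            meta = mapping.get("_meta", {})
            versions[type_name] = meta.get("security-version", "missing")
    return versions


def mismatches(versions: Dict[str, str], expected: str) -> Dict[str, str]:
    """Return only the types whose version differs from the expected one."""
    return {t: v for t, v in versions.items() if v != expected}


# Hand-written sample data (hypothetical, not taken from a real cluster).
nodes = {"nodes": {"node-a": {"version": "5.1.1"}, "node-b": {"version": "5.1.1"}}}
template = {
    "security-index-template": {
        "mappings": {
            "user": {"_meta": {"security-version": "5.1.1"}},
            "role": {"_meta": {"security-version": "5.1.1"}},
        }
    }
}
index = {
    ".security": {
        "mappings": {
            "user": {"_meta": {"security-version": "5.1.1"}},
            "extra_type": {},  # hypothetical type without any _meta block
        }
    }
}

print(node_versions(nodes))
print(mismatches(security_versions(template), "5.1.1"))
print(mismatches(security_versions(index), "5.1.1"))
```

An empty result from mismatches means every type carries the expected version. A type reported as "missing" has no _meta.security-version at all, which is worth mentioning when you report back.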

We should be able to work out what's going on from that.


#3

I followed the steps suggested in your response. Note that I've disabled X-Pack temporarily, but I presume that should have no impact on the checks below?

Does your cluster have multiple nodes?

Yes. There are several nodes in the cluster and all of them run the exact same version: "version": "5.1.1" in this case.

Template: Run GET /_template/security-index-template and check the _meta.security-version for each type in the mappings section.

Again, all of the types in the mappings section have the same security-version:
"_meta": { "security-version": "5.1.1" }

Index: Run GET /.security/_mappings and check the _meta.security-version for each type.

Similarly here, all types in the index mappings have the same version, which matches the version of the cluster:
"_meta": { "security-version": "5.1.1" }

Thanks.


#4

Is there anything else I could do to try and figure out why the role cannot be created? Could removing security-index-template and the .security index itself help before re-enabling X-Pack?

Any suggestions will be greatly appreciated. Thanks.


(Tim Vernum) #5

Sorry for not getting back to you sooner.

Removing the .security index might help, but without knowing exactly what's causing your problem, it's hard to say for sure.

Can you try turning on DEBUG for org.elasticsearch.xpack.security.authz.store.NativeRolesStore and check for log messages during startup?

  • Turn on debugging by adding this line to your elasticsearch.yml:
    logger.org.elasticsearch.xpack.security.authz.store.NativeRolesStore: DEBUG
  • Then restart the node that you are testing against (you don't need to restart the whole cluster, just the one that you're sending requests to)
  • Check elasticsearch.log for messages with [NativeRolesStore] in them.

You should see something like:

security template [security-index-template] does not exist or is not up to date, so service cannot start

or

mapping for security index not up to date, so service cannot start

There should be other messages too - if you can post all the relevant ones here, or send them to me in a private message, we should be able to get to the bottom of this.
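
If the log is noisy, a small script can pull out and count just the NativeRolesStore messages. The following is a minimal Python sketch, assuming the default 5.x log layout of [timestamp][LEVEL][logger] [node] message, where the logger name is abbreviated (for example o.e.x.s.a.s.NativeRolesStore). The regular expression and function name are hypothetical, and the sample lines are copied from the log excerpt posted later in this thread.

```python
import re
from collections import Counter

# Assumed layout: [timestamp][LEVEL][logger] [node] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+"
    r"(?P<message>.*)$"
)


def roles_store_messages(lines):
    """Count distinct messages logged by NativeRolesStore."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("logger").strip().endswith("NativeRolesStore"):
            counts[match.group("message")] += 1
    return counts


sample = [
    "[2017-07-27T21:38:44,405][INFO ][o.e.c.m.MetaDataCreateIndexService] [es-master-0] "
    "[.security] creating index, cause [auto(index api)], templates "
    "[security-index-template, template_2, template_1], shards [1]/[0], "
    "mappings [kubernetes, role, reserved-user, user]",
    "[2017-07-27T21:38:44,765][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] "
    "mapping for security index not up to date, so service cannot start",
    "[2017-07-27T21:38:44,828][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] "
    "mapping for security index not up to date, so service cannot start",
    "[2017-07-27T21:38:45,274][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] "
    "invalidating role [test_role] in cache",
]

for message, count in roles_store_messages(sample).items():
    print(count, message)
```

To run it against a real file, pass an open file handle instead of the sample list, for example roles_store_messages(open("elasticsearch.log")), adjusting the path to wherever your node writes its log.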


#6

I've re-enabled X-Pack, added logger.org.elasticsearch.xpack.security.authz.store.NativeRolesStore: DEBUG to the config and redeployed the cluster. The first thing I noticed is that I could no longer authenticate. It seems that once the .security index has been created (the first time I rolled X-Pack out), turning X-Pack off and then re-enabling it somehow confuses the system. I had to disable X-Pack again to skip auth, drop the .security index, and then redeploy with X-Pack enabled. This allowed me to authenticate with the default elastic user again.

I subsequently performed all the checks as per your instructions above, and can confirm that:

  1. All nodes in the cluster run the same version, 5.1.1.
  2. _meta.security-version for both the security-index-template template and the .security index mappings is 5.1.1, matching the cluster nodes.

Next, an attempt to create a new role appeared to be successful this time:

curl -XPOST -k -u elastic:changeme https://elasticsearch:9200/_xpack/security/role/test_role -d '
{
  "cluster": ["monitor"],
  "indices": [
    {
      "names": [ "*" ],
      "privileges": ["monitor", "delete_index"]
    }
  ]
}'

result:

{"role":{"created":true}}

And corresponding Elasticsearch logs:

[2017-07-27T21:38:44,405][INFO ][o.e.c.m.MetaDataCreateIndexService] [es-master-0] [.security] creating index, cause [auto(index api)], templates [security-index-template, template_2, template_1], shards [1]/[0], mappings [kubernetes, role, reserved-user, user]
[2017-07-27T21:38:44,765][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:44,828][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:44,841][INFO ][o.e.c.m.MetaDataUpdateSettingsService] [es-master-0] updating number_of_replicas to [9] for indices [.security]
[2017-07-27T21:38:44,983][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,053][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,113][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,181][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,272][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,274][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] invalidating role [test_role] in cache
[2017-07-27T21:38:45,464][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:38:45,511][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
...

Listing all roles returned the expected results:

curl -k -u elastic:changeme https://elasticsearch:9200/_xpack/security/role
{
  [... default roles ...]
  "test_role": {
    "cluster": [
      "monitor"
    ],
    "indices": [
      {
        "names": [
          "*"
        ],
        "privileges": [
          "monitor",
          "delete_index"
        ]
      }
    ],
    "run_as": [],
    "metadata": {}
  }
}

All good so far. However, creating a user failed:

curl -XPOST -k -u elastic:changeme https://elasticsearch:9200/_xpack/security/user/test_user -d '
{
  "password" : "changeme",
  "roles" : [ "test_role" ],
  "full_name" : "Test User",
  "email" : "testuser@example.com"
}'

result:

{"error":{"root_cause":[{"type":"illegal_state_exception","reason":"user cannot be created or changed as the user service cannot write until template and mappings are up to date"}],"type":"illegal_state_exception","reason":"user cannot be created or changed as the user service cannot write until template and mappings are up to date"},"status":500}

Elasticsearch log:

[2017-07-27T21:41:53,472][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:41:53,564][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
[2017-07-27T21:41:53,618][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-master-0] mapping for security index not up to date, so service cannot start
...

The content of the .security index:

curl -k -u elastic:changeme https://elasticsearch:9200/.security/_search

{"took":30,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":".security","_type":"role","_id":"test_role","_score":1.0,"_source":{"cluster":["monitor"],"indices":[{"names":["*"],"privileges":["monitor","delete_index"]}],"run_as":[],"metadata":{}}}]}}
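
To make the search result easier to read, here is a minimal Python sketch that counts the returned documents by _type. The response string is copied verbatim from the output above, and the function name is hypothetical. Note that a plain _search only returns the first page of hits, so on a larger index the counts would cover only that page.

```python
import json
from collections import Counter


def count_by_type(search_response: dict) -> Counter:
    """Count the hits in a _search response body, grouped by _type."""
    return Counter(hit["_type"] for hit in search_response["hits"]["hits"])


# Response body copied from the .security search above.
response = json.loads(
    '{"took":30,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},'
    '"hits":{"total":1,"max_score":1.0,"hits":[{"_index":".security","_type":"role",'
    '"_id":"test_role","_score":1.0,"_source":{"cluster":["monitor"],"indices":'
    '[{"names":["*"],"privileges":["monitor","delete_index"]}],"run_as":[],"metadata":{}}}]}}'
)

print(dict(count_by_type(response)))
```

The index holds a single role document and no user document, which is consistent with the role request succeeding and the user request being rejected.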

It's worth noting that the existing cluster on which I'm trying to enable X-Pack was upgraded from version 2.4.1 to 5.1.1. Could this be the reason?

I also tried a brand new cluster deployment (no pre-existing indexes) and was able to successfully create a role and a user, with the relevant logs below:

[2017-07-27T16:26:59,524][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] native roles store waiting until gateway has recovered from disk
[2017-07-27T16:26:59,524][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] security template [security-index-template] does not exist or is not up to date, so service cannot start
[2017-07-27T16:27:05,777][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] security template [security-index-template] does not exist or is not up to date, so service cannot start
[2017-07-27T16:27:05,777][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] security template [security-index-template] does not exist or is not up to date, so service cannot start
[2017-07-27T16:27:05,777][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] security template [security-index-template] does not exist or is not up to date, so service cannot start
[2017-07-27T16:27:06,261][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] security index [.security] does not exist, so service can start

[2017-07-27T16:28:12,320][DEBUG][o.e.x.s.a.s.NativeRolesStore] [es-client-0] invalidating role [my_role] in cache

[2017-07-27T16:28:46,441][INFO ][o.e.x.s.a.u.TransportPutUserAction] [es-client-0] added user [my_user]

Any ideas on how to debug it further will be greatly appreciated. Thanks.


(Tim Vernum) #7

Is there anything left to resolve?

It appears that somewhere between:

  • Restarting the nodes
  • Disabling + enabling X-Pack
  • Dropping and recreating .security

everything has gone into a working state.

My guess is that the component that tracks the version of your security index ended up in an incorrect state and the restart fixed it. That component has been completely changed in recent versions to try and overcome those sorts of problems.

If there's something left to debug, I'm happy to help but it does look like you have a working system now.


#8

I'd like to enable X-Pack on the existing cluster, which has a bunch of pre-existing indexes, but this is proving difficult. Starting from scratch (same cluster version) works, but I'd rather have it enabled on the main cluster.

You mentioned that X-Pack has changed in recent versions to overcome some of the issues above. Would you advise upgrading the stack to the latest version? I presume there are no major compatibility issues between 5.1.1 and the current 5.5.1.

Thanks

