ES 6.2.4 upgraded to ES 6.3.0, but the node cannot join the cluster

Hi everyone. I've run into an upgrade problem and really don't know how to deal with it.
We need to upgrade ES 6.2.4 to ES 6.3.0. The cluster is built on VMs running Ubuntu 16.04 and has master nodes, data nodes and client nodes. We use the security and monitoring features of X-Pack in ES 6.2.4. According to the official guide, we can do a rolling upgrade from 6.2 to 6.3. Here are my steps on a master node:

  1. First, disable replica shard allocation and perform a synced flush:

      PUT _cluster/settings
      {
        "persistent": {
          "cluster.routing.allocation.enable": "primaries"
        }
      }
      POST _flush/synced
    
  2. Download ES 6.3.0 and overwrite its config file with the one from the old ES 6.2.4 install. According to the official guide, X-Pack is a built-in module from 6.3 on, so I commented out the X-Pack settings in the config file.
    X-Pack settings in the 6.2.4 config:

     xpack.watcher.history.cleaner_service.enabled: true
     xpack.ml.enabled: false
     #xpack.security.enabled: false
     xpack.security.transport.ssl.enabled: true
     xpack.security.transport.ssl.verification_mode: certificate
     xpack.security.transport.ssl.keystore.path: certs/cert-node.p12
     xpack.security.transport.ssl.truststore.path: certs/cert-node.p12
    
     xpack.monitoring.exporters:
       id2:
         type: http
         host: ["http://XX.XX.XX.XX:9200"]
         auth.username: elastic
         auth.password: elastic
    
     xpack.security.authc:
       anonymous:
         username: anonymous_user
         roles: asy_role1, asy_role2
         authz_exception: true
    

In ES 6.3.0 I kept only one line:

xpack.security.enabled: true

Then I started ES 6.3.0. There was no error message and the process seemed to start normally, but the node could not join the old cluster. It does not even seem to know that the other master nodes exist, and the old cluster's logs show nothing about this new node either.
At first I thought it might be a security issue, so I copied the config/certs directory from ES 6.2 to ES 6.3 and uncommented the following settings:

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/cert-node.p12
xpack.security.transport.ssl.truststore.path: certs/cert-node.p12

When I started ES 6.3 again, I found these errors in the log:

Caused by: java.io.IOException: keystore password was incorrect
    ...
Caused by: java.security.UnrecoverableKeyException: failed to decrypt safe contents entry: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
    ...

It seems that ES 6.3 cannot use the ES 6.2 certificate file. The certificate file is shared: all ES 6.2 nodes use the same one. When I start a new ES 6.2.4 node with this certificate, it joins the cluster normally.
This is a live cluster, so we cannot just stop it and start a new one.
Now I really don't know what to do. Can someone help? Thank you so much!

The fact that X-Pack is built in doesn't mean that you don't need to configure it :slight_smile: You need to keep all the config you had in the previous elasticsearch.yml.
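
As a rough way to see which settings got dropped, you can compare the top-level xpack.* keys of the two files. This is only a sketch: the function names `xpack_keys` and `missing_xpack_keys` and the paths in the example call are made up for illustration, and it only looks at uncommented top-level keys, so a nested block such as xpack.security.authc shows up as a single entry.

```shell
# Print the uncommented top-level xpack.* keys found in an elasticsearch.yml.
xpack_keys() {
  # Keep lines that start with "xpack." and drop everything from the first ":".
  grep '^xpack\.' "$1" | sed 's/:.*$//' | sort -u
}

# Print keys that are set in the old config but absent from the new one.
# $1 = old elasticsearch.yml, $2 = new elasticsearch.yml
missing_xpack_keys() {
  new_keys=$(xpack_keys "$2")
  xpack_keys "$1" | while IFS= read -r key; do
    # Exact, whole-line match against the new file's key list.
    printf '%s\n' "$new_keys" | grep -qxF -- "$key" || printf '%s\n' "$key"
  done
}

# Example call (hypothetical paths):
# missing_xpack_keys /path/to/es-6.2.4/config/elasticsearch.yml \
#                    /path/to/es-6.3.0/config/elasticsearch.yml
```

Anything it prints is a setting from the old node that the new node is not getting.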

Please always share the full stack trace and not just small parts of it. This is really important for people trying to assist. If you don't give us all the information, we won't be able to help you.

The error is self-explanatory:

Caused by: java.io.IOException: keystore password was incorrect

From what I see, you have

xpack.security.transport.ssl.keystore.path: certs/cert-node.p12
xpack.security.transport.ssl.truststore.path: certs/cert-node.p12

One of these two (or maybe both) is password protected, so you need to give Elasticsearch the password to open them.

I would guess that your 6.2 configuration has either

xpack.security.transport.ssl.keystore.password:
xpack.security.transport.ssl.truststore.password:

in elasticsearch.yml, or

xpack.security.transport.ssl.keystore.secure_password and xpack.security.transport.ssl.truststore.secure_password stored in the Elasticsearch keystore (you can check with bin/elasticsearch-keystore list on an old node).

Thank you so much for your kind reply. Actually, I can't find

xpack.security.transport.ssl.keystore.password:
xpack.security.transport.ssl.truststore.password:

in the config file, but I did find an elasticsearch.keystore file in the config dir. I think this is what you meant by

xpack.security.transport.ssl.keystore.secure_password and xpack.security.transport.ssl.truststore.secure_password are stored in the elasticsearch keystore

right?
I ran

bin/elasticsearch-keystore list

on the old node and the output is

keystore.seed
xpack.security.transport.ssl.keystore.secure_password
xpack.security.transport.ssl.truststore.secure_password

But when I ran the same command on the new node I only got

keystore.seed

I think I understand what happened, but I still don't know how to add the same keystore entries to the new node. I copied the elasticsearch.keystore file to the new node, but it did not seem to help. Should I run bin/elasticsearch-keystore create to create a new one? Would a newly created one be the same as the old one?

You don't need to recreate the keystore; you just need to add the passwords to the keystore on the new node with

bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password
bin/elasticsearch-keystore add xpack.security.transport.ssl.truststore.secure_password

but you need to know what the password is.
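
If you would rather script this than type at the prompt, elasticsearch-keystore add can also read the value from standard input with --stdin. A minimal sketch, assuming both entries use the same password (they point at the same .p12 file in your config); the function name `add_ssl_passwords` and the `ES_HOME` and `P12_PASSWORD` variables are placeholders of mine, not something from your setup:

```shell
# Add the transport SSL keystore and truststore passwords to the
# Elasticsearch keystore of the node installed under $1.
# $1 = Elasticsearch home directory, $2 = password of the .p12 file
add_ssl_passwords() {
  es_home=$1
  password=$2
  for setting in \
    xpack.security.transport.ssl.keystore.secure_password \
    xpack.security.transport.ssl.truststore.secure_password
  do
    # --stdin makes the tool read the value from standard input
    # instead of prompting for it interactively.
    printf '%s\n' "$password" |
      "$es_home/bin/elasticsearch-keystore" add --stdin "$setting" || return 1
  done
}

# Example call (placeholders, not values from this thread):
# add_ssl_passwords "$ES_HOME" "$P12_PASSWORD"
```

Afterwards bin/elasticsearch-keystore list on the new node should show the same three entries as on the old node.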

Yes, I just found this in the security guide. But I don't know the password, and the person who built this cluster has already left. Is there another way to get the password, or can I change the old password without stopping the cluster? Many thanks!
