ECK: Added nodes and now queries are unable to find stored scripts

ES: 6.8.13

We're in the process of moving from manually maintained ES clusters to Elastic Cloud on Kubernetes (ECK). In our first test environment (a single node cluster), we added 3 ECK nodes. Afterward, all our searches that engaged a stored script started failing.

Status = 404
{
   "error": {
      "root_cause": [
         {
            "type": "resource_not_found_exception",
            "reason": "unable to find script [xxx] in cluster state"
...

When I went to list the scripts, I discovered that none were left:
GET _cluster/state/metadata?pretty&filter_path=**.stored_scripts
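For anyone who wants to script this check, here is a minimal Python sketch that pulls the stored script ids out of the JSON body returned by the request above. It assumes the response nests the scripts under `metadata.stored_scripts` and that an empty result comes back as `{}`. The sample dictionaries and the id `my_script` are made up for illustration and are not output from our cluster.

```python
def stored_script_ids(cluster_state: dict) -> list:
    """Return the sorted ids of stored scripts in a cluster-state response.

    Assumes the shape {"metadata": {"stored_scripts": {<id>: {...}}}}.
    A response with no scripts (for example {}) yields an empty list.
    """
    scripts = cluster_state.get("metadata", {}).get("stored_scripts", {})
    return sorted(scripts)


# Hypothetical response bodies, for illustration only.
empty_response = {}
sample_response = {
    "metadata": {
        "stored_scripts": {
            "my_script": {"lang": "painless", "source": "doc['field'].value"}
        }
    }
}

print(stored_script_ids(empty_response))
print(stored_script_ids(sample_response))
```

An empty list here corresponds to what we were seeing: no stored scripts in the cluster state of the cluster that answered the request.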

I found this Elastic discussion thread where someone reported something similar, but they never replied to @spinscale's question.

I assume that stored scripts are replicated when a new node is added, correct? I have since tested adding some of our scripts back and those show up in the list and our queries that rely on them work again. So, where could they have gone? :man_shrugging: I'm worried we're missing something and, if not, I certainly don't want to have to reapply all our scripts in every environment where we move to ECK.

Scripts are indeed stored in the cluster state. When you moved over to ECK, how did you ensure that the scripts were moved over as well? Because they are part of the cluster state, every node has them.

Can you share your migration path? Maybe that sheds some light..

@spinscale Thanks for the response.
Sure, all we did was add the new ECK nodes to the existing single node cluster in preparation for retiring the original (which we still haven't done). All of the indexes replicated just fine. The scripts were a different story, as I described above, so I figured that we did something wrong or made some wrong assumptions about how stored scripts are stored and/or propagated. It sounds like they should also have propagated to the new nodes, just like the index data did. :thinking:

That sounds about right. If you run

GET _cluster/state/metadata?filter_path=metadata.stored_scripts

do you see your stored scripts?

Nope.

Hm, running out of ideas a little. Can you check the logs, if there is any error or exception listed?

Also, from which version to which version did you run the upgrade? Which JVM versions are you using?

Thanks!

Can you share your migration path? Maybe that sheds some light..

We mostly followed this document: Remote clusters | Elastic Cloud on Kubernetes [master] | Elastic. Our scenario is "Connect from an Elasticsearch cluster running outside the Kubernetes cluster".

Also, from which version to which version did you run the upgrade?

We didn't upgrade. Both clusters were and still are running 6.8.13. We're using ECK 1.6, if that's relevant to troubleshooting.

Which JVM versions are you using?

I believe this is the information you're looking for:
$JAVA_HOME = /opt/jdk-15+36
[root@esdev-es-all-128gi-0 bin]# /opt/jdk-15+36/bin/java -version
openjdk version "15" 2020-09-15
OpenJDK Runtime Environment AdoptOpenJDK (build 15+36)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 15+36, mixed mode, sharing)

aaaaaah. That sheds some light and explains the behaviour. Glad we found it.

Side note: Make sure to select the documentation that matches your ECK version in the right navigation bar. You've been reading the one for the master branch.

So, a remote cluster is not the same as a node joining the cluster. A remote cluster lets you connect two clusters together while they remain two independent clusters. This is useful for cross-cluster search or cross-cluster replication. See Remote clusters | Elasticsearch Guide [7.13] | Elastic

This is the reason why the scripts are not synced up, as you still have two fully independent clusters.
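One way to confirm that two endpoints belong to separate clusters is to compare the `cluster_uuid` field that each returns from `GET /`. Nodes that have joined the same cluster report the same UUID, while a cluster that is only connected as a remote has its own. Below is a minimal Python sketch of that comparison. The response bodies, names, and UUID values are hypothetical placeholders, not values from this thread.

```python
def same_cluster(root_a: dict, root_b: dict) -> bool:
    """Return True if two `GET /` responses report the same cluster_uuid.

    A missing cluster_uuid on either side is treated as "not the same",
    since nothing can be concluded from it.
    """
    uuid_a = root_a.get("cluster_uuid")
    uuid_b = root_b.get("cluster_uuid")
    return uuid_a is not None and uuid_a == uuid_b


# Hypothetical `GET /` responses (trimmed), for illustration only.
original_cluster = {"cluster_name": "old-cluster", "cluster_uuid": "uuid-aaa"}
eck_cluster = {"cluster_name": "eck-cluster", "cluster_uuid": "uuid-bbb"}

if same_cluster(original_cluster, eck_cluster):
    print("Same cluster: cluster state, including stored scripts, is shared.")
else:
    print("Two independent clusters: stored scripts are not shared.")
```

If the UUIDs differ, each cluster has its own cluster state, so stored scripts added to one will not appear in the other.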

@spinscale Good call out, we should have made sure we were referencing the documentation for ES 6.8.
Speaking of ES 6.8, as I've mentioned, that's the version we're using. So, that precludes cross-cluster replication as the cause of scripts not replicating, right?

Exactly.. remote clusters do initiate a permanent connection to another cluster, but it is 'only' usable for CCS and CCR and does not behave like a node joining a cluster and retrieving the cluster state and using it locally.

Sorry, what I'm saying is that we're not using a version of ES that supports CCR, so that cannot be what caused the issue we experienced.

You are using the remote cluster functionality, which will not copy any scripts over into the cluster state. The remote cluster functionality is basically the foundation for CCS and CCR, but does not imply you use those, it's just the way you connect to a remote cluster.
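Because the remote cluster connection does not carry stored scripts across, they have to be re-created on the new cluster. A minimal Python sketch of one way to do that: read `metadata.stored_scripts` from the old cluster's cluster state and issue one `PUT _scripts/<id>` per script against the new cluster. The base URL passed to `put_script` is a placeholder, authentication and TLS settings (which an ECK-managed cluster will likely require) are left out, and only `lang` and `source` are copied, so any `options` on a script would need separate handling.

```python
import json
import urllib.parse
import urllib.request


def build_put_requests(cluster_state: dict) -> list:
    """Turn a cluster-state response into (path, body) pairs for PUT _scripts/<id>.

    Assumes the shape {"metadata": {"stored_scripts": {<id>: {"lang", "source"}}}}.
    """
    scripts = cluster_state.get("metadata", {}).get("stored_scripts", {})
    requests = []
    for script_id, script in sorted(scripts.items()):
        path = "/_scripts/" + urllib.parse.quote(script_id, safe="")
        body = {"script": {"lang": script["lang"], "source": script["source"]}}
        requests.append((path, body))
    return requests


def put_script(base_url: str, path: str, body: dict) -> int:
    """Send one PUT request to the target cluster and return the HTTP status.

    base_url is a placeholder such as "http://localhost:9200";
    credentials and certificates are not handled in this sketch.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical cluster-state excerpt from the old cluster, for illustration only.
old_state = {
    "metadata": {
        "stored_scripts": {
            "my_script": {"lang": "painless", "source": "doc['field'].value"}
        }
    }
}

for path, body in build_put_requests(old_state):
    # Dry run: print what would be sent instead of calling put_script().
    print("PUT", path, json.dumps(body))
```

Keeping the request building separate from the HTTP call makes it easy to review what would be sent before pointing it at a real environment.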

I am not sure I fully understood your last post.. please explain :slight_smile:

It seems we found the issue here, but you want to achieve something else, right?

I see. Since you were referring to CCS and CCR, which are not part of the version of ES we're using, I assumed that couldn't be the problem. It sounds like the "remote cluster" functionality is available in ES 6.8 and is leveraged by ECK via the method we used.
Thanks for the explanation!
