Error in search, authentication, then NodeNotConnectedException

mohammedEssabri · June 28, 2018, 3:54pm

Hello team,
I would like to ask a question about a bug that is appearing in our Elasticsearch cluster.

Elasticsearch version

1.7

Plugins installed: [license 1.0.0, shield 1.3.2, kopf 1.5.8, head]

JVM version (java -version):
7.0.1

OS version (uname -a if on a Unix-like system):
Linux 4.482-6.9-default x86_64

Description of the problem including expected versus actual behavior:

We have 6 environments: local, test, env3, env4, env5 and prod.
In each environment we have an Elasticsearch server and a Java EE application that communicates with Elasticsearch to provide search services.
In each environment, we have two nodes for the Java EE app and two nodes for the Elasticsearch server.

The problem is that we first had errors in search, and since then the server has not been working correctly. Please see the steps below.

Steps to reproduce:

Before, everything was working fine.
Then a user of the Java EE application ran a test in env4 and got an error in search (caused by a problem in the language filter).
To resolve the problem, we compared env4 with prod (same JEE app version, same ES version, same index mapping, and the same DB2 database); the search works on prod.
We deployed the same JEE app, the same mapping and the same ES version in test (identical to env4), and it works.
We did the same in env3, and it does not work.
We pointed the local env to the Elasticsearch server and database from the test env, used the same version of the JEE app, and tested: it does not work.
We captured the query that the JEE app sends to Elasticsearch and compared the two cases (with and without the problem). That led us to the following Elasticsearch request, which we tested using curl, but it did not return any hits:

Note: http://localhost stands in for the IP of our server; I have not shared the real address for security reasons.

curl -XGET 'http://localhost/_search?pretty' -d'{
  "from" : 0,
  "size" : 6,
  "query" : {
    "filtered" : {
      "query" : {
        "bool" : {
          "must" : [ {
            "bool" : {
              "must" : {
                "multi_match" : {
                  "query" : "EN",
                  "fields" : [ "proceedingslanguage_*", "proceedingslanguage_en^5.0" ],
                  "type" : "best_fields"
                }
              }
            }
          } ]
        }
      },
      "filter" : {
        "type" : {
          "value" : "decision"
        }
      }
    }
  }
}'

The result:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 27,
    "successful" : 27,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
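Since the comparison between environments rests on the index mapping being identical, one way to double-check that is to save the output of `curl -XGET 'http://localhost/_mapping?pretty'` from each environment and diff the two JSON documents. The following is only a minimal sketch: the function name `diff_mappings` is hypothetical, and it assumes both mappings have already been loaded into Python dicts (for example with `json.load`).

```python
def diff_mappings(a, b, path=""):
    """Return a sorted list of dotted paths where two nested dicts differ.

    Both arguments are assumed to be the parsed JSON of a _mapping response.
    """
    differences = []
    for key in sorted(set(a) | set(b)):
        here = f"{path}.{key}" if path else key
        if key not in a or key not in b:
            # Key exists in only one of the two mappings.
            differences.append(here)
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            # Both sides are objects: compare them recursively.
            differences.extend(diff_mappings(a[key], b[key], here))
        elif a[key] != b[key]:
            # Leaf values (or mismatched types) differ.
            differences.append(here)
    return differences
```

An empty list means the two mappings are structurally identical; any returned path (for instance one ending in `proceedingslanguage_en.analyzer`) would point at a field whose settings differ between the working and the failing environment.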

After that, we got a message in the log that disk usage was above 85% in env4 and env3:

Extract from the log on node1:

[2018-06-26 16:29:39,081][INFO ][cluster.routing.allocation.decider] [Elasticsearch_serverName_node2] low disk watermark [85%] exceeded on [-5BLsRz5RTas....][Elasticsearch_serverName_node2] free: 1gb[12.1%], replicas will not be assigned to this node
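To make the numbers in that line explicit: 12.1% free corresponds to 87.9% used, which is above the 85% low watermark. The sketch below extracts those values from a log line; the regular expression is an assumption fitted to the single line quoted above and may not match other versions or message variants, and the name `parse_watermark` is hypothetical.

```python
import re

# Pattern fitted to the one log line quoted above (an assumption, not an
# official log format specification).
WATERMARK_RE = re.compile(
    r"low disk watermark \[(?P<threshold>[\d.]+)%\] exceeded on .*"
    r"free: (?P<free>\S+?)\[(?P<free_pct>[\d.]+)%\]"
)


def parse_watermark(line):
    """Return threshold, free space and used percentage, or None if no match."""
    match = WATERMARK_RE.search(line)
    if match is None:
        return None
    free_pct = float(match.group("free_pct"))
    return {
        "threshold_pct": float(match.group("threshold")),
        "free": match.group("free"),
        "free_pct": free_pct,
        # Used percentage is simply the complement of the free percentage.
        "used_pct": round(100.0 - free_pct, 1),
    }
```

Running this over the log files of both nodes would show when each node crossed the watermark and how much space was left at that point.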

After that, we got an authentication error in env4:

[2018-06-26 18:36:46,952][WARN ][shield.authc.activedirectory] [Elasticsearch_serverName_node1]
authentication failed for user [userName]: unable to authenticate user [userName] to active directory domain [domaineName] cause: com.unboundid.ldap.sdk.LDAPException: 80090308: LdapErr: DSID-0C090400, comment: AcceptSecurityContext error...

After that, we got an error saying that the HTTP interface http://hostName:9300 is not connected:

[2018-06-27 17:38:55,370][WARN ][discovery.zen.ping.unicast] [Elasticsearch_serverName_node1] failed to send ping to [[#zen_unicast_1#][Elasticsearch_serverName_node1][inet[/hostName:9300]]]
org.elasticsearch.transport.SendRequestTransportException: [Elasticsearch_serverName_node1][inet[/hostName:9300]][internal:discovery/zen/unicast]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:286)
at org.elasticsearch.shield.transport.ShieldServerTransportService.sendRequest(ShieldServerTransportService.java:59)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:468)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.access$1000(UnicastZenPing.java:62)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:383)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [Elasticsearch_serverName_node1][inet[/hostName:9300]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:964)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:656)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:276)
... 7 more
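Because the exception concerns port 9300, a basic first check is whether that port accepts TCP connections when tried from the other node. The following is a minimal sketch using only the Python standard library; the function name `transport_port_reachable` is hypothetical, and `hostName` remains the placeholder used in the logs above. A successful TCP connect only shows that the port is open; it does not prove that the Elasticsearch or Shield handshake succeeds.

```python
import socket


def transport_port_reachable(host, port=9300, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, calling `transport_port_reachable("hostName")` from node2 would indicate whether node1's port 9300 is reachable at the network level.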

We asked the responsible team to extend the disk space, so the error related to disk space was resolved.
Even after that step, the "Node not connected" exception keeps appearing in the log file.

Thank you and best regards,
Mohammed ESSABRI

system · July 26, 2018, 3:54pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.