[Solved] ElasticSearch - ec2 : Discovery doesn't work with cloud-aws plugin

(Teddy FERDINAND) #1


I have a problem with the Elasticsearch cloud-aws plugin: it doesn't see any other instances.

Here is the configuration I am using:

cluster.name: test_tferdinand
node.name: ip-10-203-13-175

cloud:
    aws:
        region: eu-west-1
discovery:
    type: ec2

My instance is using the following IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    }
  ]
}

and the following inbound rules :

outbound rule :

Here are the startup logs:
[2016-10-31 08:29:39,938][INFO ][node ] [ip-10-203-13-12] version[2.4.1], pid[3311], build[c67dc32/2016-09-27T18:57:55Z]
[2016-10-31 08:29:39,939][INFO ][node ] [ip-10-203-13-12] initializing ...
[2016-10-31 08:29:40,895][INFO ][plugins ] [ip-10-203-13-12] modules [lang-groovy, reindex, lang-expression], plugins [cloud-aws], sites []
[2016-10-31 08:29:40,927][INFO ][env ] [ip-10-203-13-12] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [6.5gb], net total_space [7.7gb], spins? [no], types [ext4]
[2016-10-31 08:29:40,927][INFO ][env ] [ip-10-203-13-12] heap size [1015.6mb], compressed ordinary object pointers [true]
[2016-10-31 08:29:40,928][WARN ][env ] [ip-10-203-13-12] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-10-31 08:29:43,374][INFO ][node ] [ip-10-203-13-12] initialized
[2016-10-31 08:29:43,374][INFO ][node ] [ip-10-203-13-12] starting ...
[2016-10-31 08:29:43,438][INFO ][transport ] [ip-10-203-13-12] publish_address {}, bound_addresses {}
[2016-10-31 08:29:43,442][INFO ][discovery ] [ip-10-203-13-12] test_tferdinand/M6F_ibLkT46OTQRkZXXLoA
[2016-10-31 08:29:49,922][INFO ][cluster.service ] [ip-10-203-13-12] new_master {ip-10-203-13-12}{M6F_ibLkT46OTQRkZXXLoA}{}{}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-10-31 08:29:49,957][INFO ][http ] [ip-10-203-13-12] publish_address {}, bound_addresses {}
[2016-10-31 08:29:49,957][INFO ][node ] [ip-10-203-13-12] started
[2016-10-31 08:29:49,962][INFO ][gateway ] [ip-10-203-13-12] recovered [0] indices into cluster_state

Can someone help me before I go crazy? :slight_smile:

(David Pilato) #2

The indentation for region looks wrong.
Can you check that?

(Teddy FERDINAND) #3


Thanks for your reply :slight_smile:

I just botched my copy/paste :cry:

The indentation is fine in my elasticsearch.yml file.

I have edited my first post.

(David Pilato) #4

Can you check whether it works using key/secret instead of IAM roles?
We fixed an issue in 5.0 which might also be present in 2.4.1, which is why I'm asking.


(Teddy FERDINAND) #5

I tried, but I still have the same issue.

Here are the startup logs:

[2016-10-31 12:20:42,032][INFO ][node                     ] [ip-10-203-13-175] initializing ...
[2016-10-31 12:20:43,040][INFO ][plugins                  ] [ip-10-203-13-175] modules [lang-groovy, reindex, lang-expression], plugins [cloud-aws], sites []
[2016-10-31 12:20:43,077][INFO ][env                      ] [ip-10-203-13-175] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [6.5gb], net total_space [7.7gb], spins? [no], types [ext4]
[2016-10-31 12:20:43,085][INFO ][env                      ] [ip-10-203-13-175] heap size [1015.6mb], compressed ordinary object pointers [true]
[2016-10-31 12:20:43,085][WARN ][env                      ] [ip-10-203-13-175] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-10-31 12:20:44,595][DEBUG][com.amazonaws.AmazonWebServiceClient] Internal logging succesfully configured to commons logger: true
[2016-10-31 12:20:45,589][INFO ][node                     ] [ip-10-203-13-175] initialized
[2016-10-31 12:20:45,590][INFO ][node                     ] [ip-10-203-13-175] starting ...
[2016-10-31 12:20:45,650][INFO ][transport                ] [ip-10-203-13-175] publish_address {}, bound_addresses {[::]:9300}
[2016-10-31 12:20:45,654][INFO ][discovery                ] [ip-10-203-13-175] test_tferdinand/liaBfQ5VSwWL_E4_Ue-mEA
[2016-10-31 12:20:45,678][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Loading credentials from com.amazonaws.internal.StaticCredentialsProvider@41938a35
[2016-10-31 12:20:45,693][DEBUG][com.amazonaws.request    ] Sending Request: POST https://ec2.eu-west-1.amazonaws.com / Parameters: ({"Action":["DescribeInstances"],"Version":["2015-10-01"],"Filter.1.Name":["instance-state-name"],"Filter.1.Value.1":["running"],"Filter.1.Value.2":["pending"]}Headers: (User-Agent: aws-sdk-java/1.10.69 Linux/4.4.19-29.55.amzn1.x86_64 OpenJDK_64-Bit_Server_VM/24.111-b01/1.7.0_111, amz-sdk-invocation-id: c79fd64f-7885-43eb-ae97-ca8fa358e714, )
[2016-10-31 12:20:45,699][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 Canonical Request: '"POST

user-agent:aws-sdk-java/1.10.69 Linux/4.4.19-29.55.amzn1.x86_64 OpenJDK_64-Bit_Server_VM/24.111-b01/1.7.0_111

[2016-10-31 12:20:45,704][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 String to Sign: '"AWS4-HMAC-SHA256
[2016-10-31 12:20:45,705][DEBUG][com.amazonaws.auth.AWS4Signer] Generating a new signing key as the signing key not available in the cache for the date 1477872000000
[2016-10-31 12:20:45,871][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [TLSv1]
[2016-10-31 12:20:45,871][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1]
[2016-10-31 12:20:45,872][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] connecting to ec2.eu-west-1.amazonaws.com/
[2016-10-31 12:20:46,201][DEBUG][com.amazonaws.internal.SdkSSLSocket] created: ec2.eu-west-1.amazonaws.com/
[2016-10-31 12:20:46,223][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Attempt 1 to execute request
[2016-10-31 12:20:46,580][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Connection can be kept alive for 60000 MILLISECONDS
[2016-10-31 12:20:46,593][DEBUG][com.amazonaws.requestId  ] x-amzn-RequestId: not available
[2016-10-31 12:20:47,389][DEBUG][com.amazonaws.request    ] Received successful response: 200, AWS Request ID: c6eb4776-99e9-4fc7-bb71-1c9099418c74
[2016-10-31 12:20:47,395][DEBUG][com.amazonaws.requestId  ] AWS Request ID: c6eb4776-99e9-4fc7-bb71-1c9099418c74
[2016-10-31 12:20:52,032][INFO ][cluster.service          ] [ip-10-203-13-175] new_master {ip-10-203-13-175}{liaBfQ5VSwWL_E4_Ue-mEA}{}{}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-10-31 12:20:52,053][INFO ][http                     ] [ip-10-203-13-175] publish_address {}, bound_addresses {[::]:9200}
[2016-10-31 12:20:52,053][INFO ][node                     ] [ip-10-203-13-175] started
[2016-10-31 12:20:52,073][INFO ][gateway                  ] [ip-10-203-13-175] recovered [0] indices into cluster_state

(David Pilato) #6

The plugin appears to be talking to the AWS API correctly:

[2016-10-31 12:20:47,389][DEBUG][com.amazonaws.request    ] Received successful response: 200, AWS Request ID: c6eb4776-99e9-4fc7-bb71-1c9099418c74

Maybe you can set network.host: _ec2_ and see how it goes?

Another test you can try is to run without the AWS plugin and just define the unicast list of nodes manually and check if it's working or not.

If it's working then we can try to figure out what is happening with the plugin.
If not, then you probably have another problem (firewall...)
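
One quick way to rule out the firewall case is to check whether the transport port on another node can be reached at all. Here is a minimal sketch in Python, assuming the default transport port 9300 that appears in the logs above. The helper name `can_reach` is made up for this example and is not part of Elasticsearch or the plugin.

```python
import socket


def can_reach(host, port=9300, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed
        return False
```

Calling `can_reach("<private IP of another node>")` from each instance would show whether the security group rules let transport traffic through. A False result points to the network rather than to the plugin.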

To be sure about what is happening, you can also set the log level for org.apache to DEBUG so we can see exactly what AWS sends back to the plugin.

(Teddy FERDINAND) #7

I finally got it to work.

Here is the final configuration I used:

cloud:
    aws:
        region: eu-west-1

discovery:
    type: ec2
    ec2:
        groups: [my_security_group]
        host_type: private_ip

network:
    host: _ec2:privateIpv4_

When I request "/_cluster/health", I see "number_of_nodes":3.
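
For anyone who wants to script this check, here is a small Python sketch that pulls the node count out of a health response body. The function name and the abbreviated sample body are assumptions for illustration; a real response contains more fields.

```python
import json


def node_count(health_json):
    """Extract number_of_nodes from a /_cluster/health response body."""
    return json.loads(health_json)["number_of_nodes"]


# Abbreviated, hand-written sample body; a real response has more fields
sample = '{"cluster_name": "test_tferdinand", "number_of_nodes": 3}'
print(node_count(sample))  # prints 3
```

The body itself could be fetched from the HTTP port of any node (9200 in the logs above) with whatever HTTP client is at hand.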

I think the AWS API was returning too many results when these filters were not set.
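
To illustrate the idea: the DescribeInstances request in the logs only filters on instance state (running or pending), so without `groups` every such instance in the region is a candidate host. The Python sketch below shows how a security group filter narrows the list. It is not the plugin's actual code, and the instance records, field names and the 10.203.99.1 address are simplified, hypothetical stand-ins.

```python
def candidate_hosts(instances, groups=None):
    """Return private IPs of instances, optionally restricted to security groups."""
    hosts = []
    for inst in instances:
        # Skip the instance if a filter is set and it shares no group with it
        if groups and not set(groups) & set(inst["groups"]):
            continue
        hosts.append(inst["private_ip"])
    return hosts


# Hypothetical account contents: two cluster nodes and one unrelated instance
instances = [
    {"private_ip": "10.203.13.175", "groups": ["my_security_group"]},
    {"private_ip": "10.203.13.12", "groups": ["my_security_group"]},
    {"private_ip": "10.203.99.1", "groups": ["other_group"]},
]

print(candidate_hosts(instances))  # all three addresses
print(candidate_hosts(instances, groups=["my_security_group"]))  # only the two cluster nodes
```

With many unrelated instances in the account, the unfiltered list grows accordingly, which would fit the behaviour I saw.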

Thank you for your help, and sorry for the inconvenience.
