How does Curator achieve TLS from EC2 to AWS S3?

(Jonas Steinberg) #1
I have a policy on an S3 bucket that rejects anything not using TLS, and my cluster snapshots are making it to S3, so transport from my EC2 box (where Curator is running) to S3 is clearly encrypted, which is great. What I'm wondering is exactly how Curator/Elasticsearch is making TLS requests to S3. I have done a fair amount of looking and can't quite put the pieces together.

  1. My Amazon Root CA certs are in no cert bundle that is obviously referenced by anything
  2. My openssl.cnf file does not reference any directories that contain a cert bundle with the Amazon Root CA certs
  3. Does Curator use boto, and does boto have some hardcoded paths in which to check for cert bundles? Or does Curator call Elasticsearch or the repository-s3 plugin, which in turn has code with hardcoded paths to check for cert bundles? I ask because there are a few Python libraries like requests on my boxes that have hardcoded file paths for CA cert bundles... but none of those paths point to my cert bundles containing the Amazon Root CA certs. And yet asymmetric encryption initiated by Curator/Elasticsearch between EC2 and S3 is occurring...

I have also searched this forum and the goog and haven't found anything super relevant.

Any insight would be greatly appreciated,

-Jonas Steinberg

(Aaron Mildenstein) #2

Yes. The boto3 module uses the certificate bundle provided by Mozilla (which is made available in the certifi Python module).
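For anyone following along, you can see this for yourself: a minimal sketch, assuming a stock boto3/botocore install, where certifi ships Mozilla's CA bundle that is used for HTTPS verification (unless something like `AWS_CA_BUNDLE` overrides it):

```python
import ssl

import certifi

# certifi.where() is the path to Mozilla's CA bundle (cacert.pem)
bundle = certifi.where()
print(bundle)

# Loading it into an SSL context confirms it parses and contains CAs
ctx = ssl.create_default_context(cafile=bundle)
print(len(ctx.get_ca_certs()))
```

The Amazon Root CAs are in the Mozilla bundle, which is why no system-level openssl.cnf configuration is needed on the Curator side.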

(Jonas Steinberg) #3

Great. Thanks so much for the quick follow up.

(Jonas Steinberg) #4
One last question. Since my masters are the only boxes that have Curator (and therefore certifi and its CA bundle), does that mean all snapshot data is actually retrieved from the data nodes by the masters and sent from the masters to S3, right?

(Aaron Mildenstein) #5

Curator connects to the Elasticsearch cluster to make the API call, but the cluster itself uses the S3 plugin to communicate with S3. Curator plays no role after that. You can point Curator at a master or a data node to make the API call, and then Elasticsearch does the rest.

With regards to snapshots, each master node and data node in the cluster must have read/write access to the shared storage the repository uses (in your case, S3). Snapshot data is not shipped through the master nodes: each node sends data from the primary shard(s) it holds, and the elected master node ships the cluster state information.
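To make that concrete, Curator's whole role amounts to a couple of REST calls like these (the repository name `my_s3_repo` and bucket `my-bucket` are hypothetical); once Elasticsearch receives them, each node talks to S3 on its own:

```python
import json

# PUT _snapshot/my_s3_repo -- register an S3 repository (requires the
# repository-s3 plugin on every node in the cluster)
register_body = {
    "type": "s3",
    "settings": {"bucket": "my-bucket"},
}

# PUT _snapshot/my_s3_repo/snapshot-1 -- take a snapshot; the data
# nodes then stream their primary shards directly to the bucket
snapshot_body = {
    "indices": "*",
    "include_global_state": True,
}

print(json.dumps(register_body, indent=2))
print(json.dumps(snapshot_body, indent=2))
```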

(Jonas Steinberg) #6

Hm, glad I asked, because then I have no idea how my data nodes are passing my S3 bucket's force-TLS requirement. Yes, those data nodes (EC2 instances) have, in certain places, Amazon's Root CA public keys, so a functional handshake to S3 is theoretically possible... but I can find no configuration in openssl.cnf, Elasticsearch, or anywhere else that points to those Amazon Root CA public keys, so I've no idea how the encryption in transit is actually working correctly. But it is...

(Aaron Mildenstein) #7

Have you looked at the S3 plugin code? It handles the IAM/S3 stuff, and I'm sure it's in there.

(Jonas Steinberg) #8

@theuntergeek No. For some reason I was unconsciously assuming that it was closed source, but now that I know it's not that will be my next step :slight_smile: thx.

(Jonas Steinberg) #9

@theuntergeek Finally found the answer which was right under my nose the whole time. Java itself includes a cacerts bundle which in a way I knew, but never ultimately realized the purpose of! Hence Curator --> repository-s3 plugin --> AWS Java SDK --> Java cacerts bundle (or some other bundle explicitly specified) --> Amazon Root CA public keys --> verifies keys sent by AWS services such as S3 are trusted and vice-versa --> TLS.
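As a rough Python analogue of that chain, for anyone who lands here later (the JVM's bundle typically lives at `$JAVA_HOME/lib/security/cacerts`; the sketch below inspects certifi's Mozilla bundle instead, which also carries the Amazon roots):

```python
import ssl

import certifi

# Mozilla's bundle, like the JVM's default cacerts, includes Amazon's
# root CAs; certifi annotates each certificate with its issuer name
with open(certifi.where()) as f:
    pem = f.read()
print("Amazon Root CA" in pem)

# A context anchored in such a bundle verifies S3's certificate chain
# during the handshake, requiring a valid certificate by default
ctx = ssl.create_default_context(cafile=certifi.where())
print(ctx.verify_mode == ssl.CERT_REQUIRED)
```

Same idea, different trust store: the AWS Java SDK hands the handshake to the JVM, and the JVM's cacerts plays the role the CA bundle plays here.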

(system) closed #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.