I have a few questions around what encryption options are available in Elasticsearch:
• Does Elasticsearch offer encryption of data at rest (in other words, of all the data it is storing)?
• Does it offer encryption at the index level?
• Does it offer encryption at the field level (or column level, in relational database parlance)?
Elasticsearch does not have built-in encryption.
We support running on top of infrastructure-level encryption via dm-crypt (for platinum-level customers), and there are options for encrypting or hashing incoming data via Logstash before indexing it into Elasticsearch.
Thank you for the response, @TimV .
With regard to field- and document-level encryption, how is the data decrypted so that it can be searched? I'd assume that for it to work, the decrypted data is indexed, it's encrypted when stored, and somehow searched and decrypted when returning results? Is it decrypted on the fly? Is there any documentation where I can read up on how field- and document-level encryption are done?
There is no field- or document-level encryption within Elasticsearch, so if you encrypt or hash data prior to indexing, e.g. using Logstash, you will need to handle that translation at the application level, as the encrypted/hashed values are what will be indexed and searchable.
Thanks, @Christian_Dahlqvist. That's what I was assuming, but I wanted to make sure that my assumption was correct.
So, basically, it's quite pointless to encrypt a document or a document's field, since I won't be able to search on it?
You can still filter or aggregate on encrypted/hashed keywords, but doing any kind of free-text search is not really possible.
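To illustrate filtering and aggregating on hashed keywords, here is a minimal sketch in Python. It only builds the request bodies (a `term` query and a `terms` aggregation) and does not talk to a cluster. The field name `email_hash`, the helper names, and the choice of SHA-256 are assumptions made for the example, not something Elasticsearch or Logstash prescribes; the field is assumed to be mapped as `keyword`.

```python
import hashlib
import json


def hash_value(value: str) -> str:
    """Un-keyed SHA-256 hex digest of a UTF-8 string (assumed hashing scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def exact_match_query(field: str, plaintext: str) -> dict:
    """Body for an exact-match filter: hash the search input the same way
    the data was hashed before indexing, then use a term query."""
    return {"query": {"term": {field: hash_value(plaintext)}}}


def count_by_value(field: str, size: int = 10) -> dict:
    """Body for a terms aggregation: buckets are hashed values, so identical
    inputs are grouped together even though they cannot be read back."""
    return {"size": 0, "aggs": {"by_value": {"terms": {"field": field, "size": size}}}}


# "email_hash" is a hypothetical keyword field holding hashed e-mail addresses.
print(json.dumps(exact_match_query("email_hash", "alice@example.com"), indent=2))
print(json.dumps(count_by_value("email_hash"), indent=2))
```

A prefix or wildcard search on `email_hash` would not return anything useful, because similar inputs produce unrelated hashes.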
"pointless" depends on your purpose for encryption, and the problem you are trying to solve. Your high level options are:
Store encrypted data in ES. It is not searchable, nor can it be used in aggregations, but any clients that have the correct keys can decrypt the data and make sense of it. This implies that you are using ES simply as a storage system for that field, not a search engine.
Store hashed data using a stable keyed hash. If you configure your hashing process so that it produces the same values for identical input, then you can aggregate and match on identical values, but you cannot reverse the hashing, nor can you search for original input values (unless you have the key).
Store hashed data using an un-keyed hash. You can aggregate and match on identical values. You cannot reverse the hashing, but you can search for original input values by passing them through the same hash and searching for the hashed value. You can only perform full-match "keyword" style searches (no prefixes, etc.) due to the hash.
Alternatively, you can run ES on encrypted volumes (e.g. dm-crypt), in which case everything is encrypted at the storage layer but is in plain text at the application layer.
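The difference between the keyed and un-keyed hashing options can be sketched with Python's standard library. This is an illustration under assumptions, not how Logstash or Elasticsearch does it: HMAC-SHA256 stands in for "a stable keyed hash", plain SHA-256 for the un-keyed one, and the key and function names are made up for the example.

```python
import hashlib
import hmac


def unkeyed_hash(value: str) -> str:
    """Plain SHA-256: anyone can recompute it for a guessed input."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256: stable for a given key, but only key holders can
    recompute the stored value for a given input."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would come from a secret store.
key = b"example-secret-key"

# Both are deterministic, so identical inputs can be matched and aggregated.
assert unkeyed_hash("alice") == unkeyed_hash("alice")
assert keyed_hash("alice", key) == keyed_hash("alice", key)

# Without the right key, the stored keyed value cannot be reproduced,
# so a client cannot turn a plaintext into a searchable term.
assert keyed_hash("alice", key) != keyed_hash("alice", b"some-other-key")
```

In both cases the stored value is a fixed-length digest, which is why only whole-value matches work.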