ElasticSearch 5.6 Using ingest processors to modify GET response

Hello all!

I want to implement an ElasticSearch plugin to pseudonymize data (or use something that already exists, but I didn't find anything that fits). I can use ingest nodes and pseudonymize fields via a pipeline before my data is indexed. But how can I transform my data back when using HTTP GET? I don't think I can use a pipeline with the GET method.

My clients use two services. One service puts some data in ElasticSearch like
PUT http://127.0.0.1:9203/customer/doc/1
And the other client is Kibana, which should read the decrypted data.

I can encrypt the data via my plugin, and by using a pipeline I don't have to change much on the client's side. Just
PUT http://127.0.0.1:9203/customer/doc/1?pipeline=encrypt_data
In my lack of knowledge, I thought we would be able to use something like:
GET http://127.0.0.1:9203/customer/doc/1?pipeline=decrypt_data

Can anyone point me to how I can try to resolve my problem?

ElasticSearch 5.6 (we don't use LogStash so FingerPrint plugin is not an option).

Best regards,
PK

The thing is that if you can "decrypt" the data it means to me that the data is not really anonymized.

Anyway, this feature request was closed recently, but maybe you can ask to reopen it if it suits your needs:

It's anonymized at rest. I can have a plugin that encrypts request before it gets indexed so I assumed I could have a plugin/processor that can decrypt some fields before presenting response from search query to the user.

Maybe I will simplify my question: is there a way to modify a field from search response before it is returned to the requester?

Maybe a script field with a Painless script?
Or write your own REST endpoint as a plugin?
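
To make the script-field idea a bit more concrete, here is a minimal Python sketch that only builds the search request body. It assumes the `name` field from this thread; the output field name `name_view` and the helper `build_script_field_query` are made up for illustration. Painless has no built-in decryption, so the script here just reads the stored value from `_source`; any real decryption would still need custom code.

```python
import json

def build_script_field_query(field, out_name):
    """Build a search body that returns `field` through a Painless script field."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            out_name: {
                "script": {
                    "lang": "painless",
                    # Assumption: "source" is the script key on 5.6;
                    # earlier 5.x releases used "inline" instead.
                    "source": "params['_source']['%s']" % field,
                }
            }
        },
    }

body = build_script_field_query("name", "name_view")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a request to http://127.0.0.1:9203/customer/_search.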

But I'm unsure about the use case.

Anyway, if you take X-Pack, it has a security feature (commercial) which allows you to show or hide fields depending on the connected user. It's available OOTB without any code to write or maintain. Maybe that's what you need.

The use case is as follows:

  1. User 1 inserts the data, e.g. http://127.0.0.1:9203/customer/doc/1
{
	"name": "TEST",
	"surname": "TEST2"
}
  2. The plugin intercepts this request and changes it via the ingest node plugin. So before indexing, the request looks like this, for example:

{
"name": "f8h3490f3",
"surname": "f4ij30fj043j"
}

  3. This data gets indexed, so we don't store the TEST and TEST2 values, but hashes instead.
    ------------------ Up to this point I have the functionality implemented. The problem is below -----------------
  4. When user 2 searches for the data via e.g. http://127.0.0.1:9203/customer/_search
    he should get decrypted data: not hashes like "f8h3490f3" but real data like "TEST". But when somebody breaks into our server and wants to read the data on the hard drive, he can see only hashes.

So the problem is not HOW to decrypt the data. The problem is that when I make a search query, I want to intercept the response which would be:

"hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
        {
            "_index": "customer",
            "_type": "doc",
            "_id": "5",
            "_score": 1,
            "_source": {
                "name": "f8h3490f3",
                "surname": "f4ij30fj043j"
            }
        },

and change the hashes to "TEST" and "TEST2" via my dehashing/decrypting method.
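
One way to picture that interception step, wherever it ends up running (a custom REST endpoint, a proxy in front of Elasticsearch, or the client itself), is a function that walks the `hits.hits` array and rewrites selected `_source` fields. The Python sketch below is only an illustration under assumptions: `restore_hits` and `restore_value` are made-up names, and the lookup table is a stand-in for whatever real dehashing/decrypting method the plugin uses.

```python
import copy

# Hypothetical stand-in for the real decryption step: pseudonym -> original value.
PSEUDONYM_TABLE = {"f8h3490f3": "TEST", "f4ij30fj043j": "TEST2"}

def restore_value(value):
    """Return the original value for a pseudonym, or the value unchanged if unknown."""
    return PSEUDONYM_TABLE.get(value, value)

def restore_hits(response, fields):
    """Return a copy of a search response with the given _source fields restored."""
    result = copy.deepcopy(response)  # leave the caller's response untouched
    for hit in result.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        for field in fields:
            if field in source:
                source[field] = restore_value(source[field])
    return result

response = {
    "hits": {
        "total": 5,
        "max_score": 1,
        "hits": [
            {
                "_index": "customer",
                "_type": "doc",
                "_id": "5",
                "_score": 1,
                "_source": {"name": "f8h3490f3", "surname": "f4ij30fj043j"},
            }
        ],
    }
}

restored = restore_hits(response, ["name", "surname"])
print(restored["hits"]["hits"][0]["_source"])
```

The stored documents stay pseudonymized; only the copy handed back to the requester carries the real values.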

So you want to encrypt the disk.

Use something like dm-crypt, for example. Note that this will generally slow your system down a bit, but it's probably the safest option.

I don't really want to encrypt the entire file system.
Moreover I have another use case that I wanted to add to this plugin.

The other use case is as follows:
I want to remove some fields from the response based on requester's permissions, so ability to modify response "on the fly" via custom plugin would be great for that.
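
Such on-the-fly field removal could be sketched as a filter over the `hits.hits` array of a search response. This is a minimal Python sketch under assumptions: the role names, the `ALLOWED_FIELDS` table, and `filter_hits_for_role` are all invented for illustration and are not part of any Elasticsearch API.

```python
import copy

# Hypothetical permission table: role -> fields that role is allowed to see.
ALLOWED_FIELDS = {
    "admin": {"name", "surname"},
    "analyst": {"name"},
}

def filter_hits_for_role(response, role):
    """Return a copy of a search response keeping only fields the role may see."""
    allowed = ALLOWED_FIELDS.get(role, set())  # unknown roles see no fields
    result = copy.deepcopy(response)
    for hit in result.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        hit["_source"] = {k: v for k, v in source.items() if k in allowed}
    return result

response = {"hits": {"hits": [{"_id": "5", "_source": {"name": "TEST", "surname": "TEST2"}}]}}
filtered = filter_hits_for_role(response, "analyst")
print(filtered["hits"]["hits"][0]["_source"])
```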

That's what x-pack is providing out of the box: Setting Up Field and Document Level Security | X-Pack for the Elastic Stack [6.2] | Elastic

Well, that's true, but to be more specific, only the PLATINUM X-Pack subscription provides this. We don't want to upgrade our licence...

Which license do you have? Basic? Gold?

Basic

Hello again,

After a discussion with decision-makers in my company, we would like to upgrade our licence to PLATINUM, but our use case remains: even if we have X-Pack Platinum, the recommended solution is still to encrypt data at rest using dm-crypt.
We've got a lot of data flying back and forth, and just up to 3 fields with personal data that we would like to pseudonymize according to GDPR regulations.

Do you plan to have this feature in the near future? Encrypting all of our ES data / whole volumes would be too much overhead.

I moved your question to #x-pack so experts on that field can help.
