Ingest pipeline routing documents to the appropriate target index requires permissions on the target index

Suppose you don't trust data-feeding users to place documents into the right index, so you create a virtual index with an ingestion pipeline that, among other things, determines the proper target index alias based on a few fields in each document.

Users granted the privilege "create_doc" or "write" on the virtual index should be able to use this index for feeding.

However, it seems they also need "create_doc" or "write" privileges on the target indexes determined by the pipeline. This would have the effect, of course, that those users could also feed directly into these target indexes (and choose the wrong one).

The only way I can imagine solving this issue is by introducing an ingestion pipeline for all target indexes that would prevent direct feeding for documents not having passed through the pipeline of the virtual index. In my opinion, however, this should not be necessary: if a user has been granted access to the virtual index, and the ingestion pipeline internally determines the target index, there should not have to be additional permission checks for that user on the target index.

Any ideas how to achieve what I want? This may be a CR for the Elasticsearch platform.

The document is indexed after the ingest pipeline has run, and the indexing is done by the same authenticated user that made the request. The ingest pipeline does not change the user, which is why that user needs permissions on the final target index.
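To make the order of operations concrete, here is a toy model in Python. It is not Elasticsearch code: the user names, index names, the `route` function, and the check on the requested index are all assumptions made for illustration. It only shows that the privilege check on the final target uses the same user who sent the request.

```python
from typing import Dict, Set

# Hypothetical privilege table: user name -> indices the user may write to.
PRIVILEGES: Dict[str, Set[str]] = {
    "harmless_feeder": {"virtual-ingest"},
    "trusted_feeder": {"virtual-ingest", "orders", "invoices"},
}


def route(doc: dict) -> str:
    """Stand-in for the ingest pipeline: pick the target index from a field."""
    return "orders" if doc.get("type") == "order" else "invoices"


def index_document(user: str, requested_index: str, doc: dict) -> str:
    """Return the index the document lands in, or raise PermissionError."""
    allowed = PRIVILEGES.get(user, set())
    # Assumed first check: the user must be allowed to write to the index named in the request.
    if requested_index not in allowed:
        raise PermissionError(f"{user} may not write to {requested_index}")
    # The pipeline runs and may pick a different target index.
    final_index = route(doc)
    # The write to the final target is still authorized for the same user.
    if final_index not in allowed:
        raise PermissionError(f"{user} may not write to {final_index}")
    return final_index
```

In this model, "trusted_feeder" can feed through "virtual-ingest", while "harmless_feeder" is rejected at the second check even though the first one passes.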

It is not clear how you are changing the value of the _index field to change the target index, but the documentation of the reroute processor states this explicitly:

Note that the client needs to have permissions to the final target. Otherwise, the document will be rejected with a security exception.

If you don't trust your data-feeding users in this case, don't give them access to write directly into Elasticsearch. You can put Logstash in front of it, configure an http input and then create your logic to index the data into the correct index.

In this case your users wouldn't even need permission to write into Elasticsearch, just permission to send logs to Logstash. The user configured in Logstash would have the permissions to write to your indices.
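Whatever sits in front of Elasticsearch (Logstash, an API gateway, or a small service), the routing logic it needs is small. The following Python sketch shows the idea only; the field names "department" and "doc_type", the alias names, and the fallback alias are hypothetical. It picks a target per document and renders an NDJSON body in the shape the _bulk API expects, which the front layer would then send with its own credentials.

```python
import json
from typing import Iterable, Mapping

# Hypothetical routing table: (department, doc_type) -> target index alias.
ROUTES = {
    ("sales", "order"): "sales-orders",
    ("finance", "invoice"): "finance-invoices",
}
DEFAULT_ALIAS = "unrouted"  # hypothetical fallback for documents that match no rule


def choose_target(doc: Mapping[str, object]) -> str:
    """Pick the target alias from two fields of the document."""
    key = (doc.get("department"), doc.get("doc_type"))
    return ROUTES.get(key, DEFAULT_ALIAS)


def to_bulk_body(docs: Iterable[Mapping[str, object]]) -> str:
    """Render documents as an NDJSON bulk body using the 'create' action."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": choose_target(doc)}}))
        lines.append(json.dumps(dict(doc)))
    return "\n".join(lines) + "\n"
```

Because the clients never talk to Elasticsearch directly, only the account used by the front layer needs "create_doc" on the target aliases.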


Hi @Jurgen_Wagner_DVT, I wholly agree with @leandrojmp.

This quote alone leads me to a solution with a layer outside Elasticsearch such as Logstash, API Gateway etc.

The approach you are suggesting would, IMHO, be a significant security hole and a ripe target for privilege escalation, whether intentional or caused by a simple bug in the ingest pipeline.

Allowing a pipeline that is executed with the privileges of the calling user to access resources that the calling user is not authorized for would be a security hole and an escalation of privileges.

Elasticsearch as a datastore and search engine has a strong security model, and many users depend on it. I do not see this as a feature that would be added. Of course, you can always submit an enhancement request, and perhaps the Product Managers / Tech Lead will see it differently.


True, the solution this should ultimately converge to is an independent ingestion service with proper authentication of clients. However, we can only implement that in the mid to long term, and it is desirable to have as few moving parts as possible.

In a world with trusted and well-tested pipelines, though, it should be possible to have a final permission validation at the ingestion point only. The pipeline should not elevate the privileges of the user other than allowing the pipeline to index documents in the ultimately-chosen index. This is also what a simple microservice to handle documents for ingestion would basically do.

The "run_as" feature won't help me: if I allow a "harmless_feeder" (allowed "create_doc" on the virtual ingesting index) to impersonate the "empowered_target_index_writer" (with "create_doc" on all possible target indexes), nothing keeps the "harmless_feeder" from manually selecting a target index and feeding into it.
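A short Python sketch of why this fails, using only plain dictionaries and no client library. The role body follows the shape of the Elasticsearch role API (an "indices" list plus a "run_as" list), and impersonation is requested with the "es-security-runas-user" header; the index name "virtual-ingest" and the helper names are hypothetical.

```python
def feeder_role() -> dict:
    """Role body for the feeder: write to the virtual index, impersonate the writer."""
    return {
        "indices": [{"names": ["virtual-ingest"], "privileges": ["create_doc"]}],
        "run_as": ["empowered_target_index_writer"],
    }


def impersonated_request(target_index: str, doc: dict) -> dict:
    """Describe an indexing request that the feeder sends while impersonating the writer."""
    return {
        "method": "POST",
        # The client chooses this path, so it can name any target index it likes.
        "path": f"/{target_index}/_doc",
        "headers": {"es-security-runas-user": "empowered_target_index_writer"},
        "body": doc,
    }
```

The impersonation header is attached by the client to a request whose path the client also controls, so the pipeline of the virtual index can simply be bypassed.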

So, the only option available would be to grant "create_doc" on both the virtual index and all relevant potential target indexes, but to prohibit direct feeding into the target indexes by checking whether the request has first been processed by the pipeline of the virtual index. The virtual index's ingestion pipeline has to add some magic to documents, and the target indexes will refuse to index documents without this magic. This sounds a bit awkward and won't simplify testing.
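A rough sketch of what the two pipeline definitions could look like, written as Python dictionaries that serialize to the JSON bodies of the ingest pipeline API. The marker field "routed_by", its value, and the field "target_alias" are hypothetical. The sketch assumes the guard is attached to each target index through the "index.final_pipeline" setting and that Elasticsearch runs it for documents redirected by the first pipeline, which should be verified against the version in use.

```python
import json

MARKER_FIELD = "routed_by"       # hypothetical marker field ("magic")
MARKER_VALUE = "virtual-ingest"  # hypothetical marker value


def virtual_pipeline(target_field: str = "target_alias") -> dict:
    """Pipeline of the virtual index: stamp the marker, then redirect the document."""
    return {
        "description": "Stamp documents and route them to their target alias",
        "processors": [
            {"set": {"field": MARKER_FIELD, "value": MARKER_VALUE}},
            # Setting _index from a template snippet changes the target index.
            {"set": {"field": "_index", "value": "{{" + target_field + "}}"}},
        ],
    }


def guard_pipeline() -> dict:
    """Pipeline for the target indexes: reject documents that lack the marker."""
    return {
        "description": "Reject documents that did not pass the virtual index pipeline",
        "processors": [
            {
                "fail": {
                    # Painless condition; a missing field evaluates to null and fails.
                    "if": f"ctx['{MARKER_FIELD}'] != '{MARKER_VALUE}'",
                    "message": "document was not routed through the virtual index",
                }
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(virtual_pipeline(), indent=2))
    print(json.dumps(guard_pipeline(), indent=2))
```

One weakness of this design is that the marker travels inside the document, which the client controls. A feeder who learns the marker can add it manually, so the guard catches mistakes rather than determined misuse.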

This feature could be easily implemented in Elasticsearch by allowing the use of "run_as" impersonation only from within a pipeline (whose execution we trust more), not from externally-submitted requests (which are entirely under the control of the external clients), e.g., by having a security role configuration item "run_as_from_pipeline".

Anyway, thanks for the feedback.

@Jurgen_Wagner_DVT

Please feel free to open a feature request in the Elasticsearch repo for review. No one on this forum has the authority to make changes to the code, and a request from a user provides more context.
