Scroll id generation

Hello,

Could someone point me to where I can find how the scroll id is generated from a scroll request?
I have seen that it is a base64-encoded string, but I am not sure how the different values are generated.
Thank you!

Hi @macdrai,

May I ask what you want to do with this string? The idea is that you treat it as an opaque token and simply pass what you received to subsequent scroll requests.
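To illustrate what "opaque token" means in practice, here is a minimal sketch of a scroll loop that only echoes the id back. The `scroll_call` parameter is a hypothetical stand-in for whatever HTTP client posts to /_search/scroll, and the request body shape (`scroll`, `scroll_id`) is an assumption that may differ between Elasticsearch versions.

```python
from typing import Callable, Dict, Iterator

# Hypothetical transport: takes a request body and returns the parsed JSON
# response of one scroll call (e.g. a wrapper around POST /_search/scroll).
ScrollCall = Callable[[Dict[str, str]], dict]


def iterate_scroll(first_page: dict, scroll_call: ScrollCall,
                   keep_alive: str = "1m") -> Iterator[dict]:
    """Yield hits page by page until an empty page is returned."""
    page = first_page
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # The scroll id is never parsed or modified, only passed along.
        page = scroll_call({"scroll": keep_alive,
                            "scroll_id": page["_scroll_id"]})
```

The client always uses the id from the most recent response and never looks inside it.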

Daniel

Sure thing!
We are using a reverse proxy to authenticate our users. I am wondering whether the scroll id could in any way be compromised and timed to access indices that a user should not be able to query under our basic proxy rules. As of now, the /_search/scroll endpoint is open.
Thank you.

Hi @macdrai,

here is the relevant source code snippet.

As you can see, it encodes state in this string, which we later decode. The components (query id and node id) are either easy to guess or constant, so to me it looks as if somebody could tamper with it.
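As a rough illustration of why base64 offers no protection here, the sketch below decodes a token, changes one field, and re-encodes it. The `type;count;queryId:nodeId;` layout is made up for the example and is only an assumption; the real layout is whatever the linked source code produces.

```python
import base64


def encode_token(text: str) -> str:
    """Base64-encode a plain string (URL-safe alphabet)."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def decode_token(token: str) -> str:
    """Reverse encode_token; restores padding if it was stripped."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


# Hypothetical layout, for illustration only.
original = encode_token("queryThenFetch;1;42:node-A;")

# Anyone holding the token can read it, edit a field, and re-encode it.
forged = encode_token(decode_token(original).replace("42:", "43:"))
```

Nothing in the token lets the receiver tell `forged` apart from an id it issued itself, which is the tampering concern described above.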

Daniel

I see. If we are able to block access to the node ids, do you think we could still provide the endpoint securely?
Best regards.

Hi Simon,

If we are able to block access to the node ids [...]

How would you do this? By altering the scroll id?

With your solution you can provide authentication ("Who are my users?") and that's fine. But if you want to do authorization ("What are my users allowed to do?") this is an entirely different topic.
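If the proxy did alter the scroll id, one conceivable approach is to attach an HMAC that binds each id to the user it was issued to, and to reject any id whose tag does not verify. This is only a sketch under assumptions: `seal` and `unseal` are hypothetical helper names, the proxy is assumed to rewrite ids in both responses and requests, and the secret key is assumed to live only on the proxy.

```python
import hashlib
import hmac


def _tag(scroll_id: str, user: str, key: bytes) -> str:
    # Bind the id to the user name; assumes user names contain no newline.
    message = f"{user}\n{scroll_id}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def seal(scroll_id: str, user: str, key: bytes) -> str:
    """Return the id with a per-user tag appended (sent to the client)."""
    return f"{scroll_id}.{_tag(scroll_id, user, key)}"


def unseal(sealed: str, user: str, key: bytes) -> str:
    """Return the original id, or raise if the tag does not match."""
    scroll_id, sep, tag = sealed.rpartition(".")
    expected = _tag(scroll_id, user, key)
    if not sep or not hmac.compare_digest(tag.encode("utf-8"),
                                          expected.encode("ascii")):
        raise ValueError("scroll id was modified or issued to another user")
    return scroll_id
```

This only detects modified or borrowed ids; it does not by itself decide which indices a user may search, so it is not a substitute for real authorization.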

By default, Elasticsearch is meant to be used in a "trusted" environment. If you need proper authorization, I'd recommend you have a look at our commercial solution, called Shield, even if it's just to get an idea of which features you might want (you can try it for a limited time without any registration just by installing the Shield and license plugins). AFAIK there are also a few community extensions that might help you, but I have no experience with them.

Daniel