Add ability to cache HTTP redirects in Elasticsearch output

rayharris-ibm · February 3, 2017, 6:56pm

Hi everyone! I'm a software engineer at IBM in the logging service. We run several large multi-tenant Elasticsearch clusters to support both internal and external customers. An issue we're running into is that we need the ability, for various reasons, to migrate tenants from one cluster to another (often in different data centers).

Our multi-tenant solution uses a custom-written ES proxy that authenticates clients and applies capping, throttling, and blocking based on the state of the tenant's account. As part of our solution to tenant migration, we're implementing what we're calling "redirect caching" in both libbeat and the logstash-output-elasticsearch plugin. This would allow our ES proxy servers to move a tenant's connections from one data center to another automatically, without the tenant having to reconfigure their log shippers.

We would like feedback as to whether this would be something the community would be interested in having included in the Elasticsearch output for Beats and Logstash. It would benefit us to have it included so that our customers could use "off-the-shelf" shippers instead of us having to maintain a fork of Beats and Logstash.

Below is a description of how redirect caching would work. And here are the links to the code for both Beats and the logstash-output-elasticsearch plugin:

Let me know what you think about the idea. Thanks!

Ray Harris, Software Engineer
IBM Cloud

HTTP Redirect Caching

As part of our investigation into implementing tenant migration, we looked into giving the log shippers the ability to cache HTTP redirects, so that follow-on HTTP requests during that session go straight to the redirect target instead of the originally configured endpoint.

For example, consider a tenant using the logging service in data center 1 (dc1). Without redirect caching, the process of shipping a log is as follows:

The log ingestion endpoint is logs.dc1.example.com
The shipper looks up the IP of the endpoint in DNS
The shipper connects to the IP using HTTPS and basic authentication
The server checks the tenant's account and returns a 200 if allowed to send logs
The shipper starts sending logs

If we want to migrate the tenant to a different data center (e.g., dc2), without redirect caching this is the process:

Behind the scenes, we migrate the tenant's existing logs to the new data center
For a short time we stream their logs from dc1 to dc2
We allow the tenant to ship logs to either dc1 or dc2
The tenant has to change the endpoints of their shippers to logs.dc2.example.com
We block the tenant's shippers from connecting to dc1, so they can only connect to dc2

This process does not scale and is error-prone, because every tenant has to update their shippers by hand. With redirect caching, the process of shipping a log changes slightly, but the migration process can become automatic. Here's how the redirect caching would work:

The log ingestion endpoint is logging.example.com
The shipper looks up the IP of the endpoint in DNS
The shipper connects to the IP using HTTPS and basic authentication
The proxy looks up the tenant and determines their current data center
The response to the initial connection is an HTTP 302 redirect to the correct data center
The shipper caches the redirect
The shipper connects to the redirect using HTTPS and basic authentication
The server checks the tenant's account and returns a 200 if allowed to send logs
The shipper starts sending logs
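
To make the shipper side of those steps concrete, here is a minimal sketch in Python. It is not the actual libbeat or logstash-output-elasticsearch code; the class name, the `transport` callable, and the `max_redirects` limit are all assumptions made for illustration. The transport stands in for the shipper's real HTTPS client and returns a status code plus response headers.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for the shipper's HTTPS client:
# takes (url, body) and returns (status_code, response_headers).
Transport = Callable[[str, bytes], Tuple[int, Dict[str, str]]]


class RedirectCachingClient:
    """Sends to a configured endpoint and remembers the last 302 target."""

    def __init__(self, endpoint: str, transport: Transport, max_redirects: int = 3):
        self.endpoint = endpoint              # e.g. https://logging.example.com
        self.transport = transport
        self.max_redirects = max_redirects    # assumed safety limit against loops
        self.cached_target: Optional[str] = None

    def send(self, body: bytes) -> int:
        # Prefer the cached redirect target over the configured endpoint.
        url = self.cached_target or self.endpoint
        for _ in range(self.max_redirects + 1):
            status, headers = self.transport(url, body)
            if status == 302 and "Location" in headers:
                # Cache the redirect so follow-on requests skip the extra hop.
                url = self.cached_target = headers["Location"]
                continue
            return status
        raise RuntimeError("too many redirects")


# Usage with a fake transport that records which URLs were contacted.
calls: List[str] = []

def fake_transport(url: str, body: bytes) -> Tuple[int, Dict[str, str]]:
    calls.append(url)
    if url == "https://logging.example.com":
        return 302, {"Location": "https://logs.dc1.example.com"}
    return 200, {}

client = RedirectCachingClient("https://logging.example.com", fake_transport)
client.send(b"first log line")   # contacts logging.example.com, then dc1
client.send(b"second log line")  # contacts dc1 directly via the cache
```

Because a 302 from the cached target simply replaces the cached value, the same loop would also follow a later redirect from dc1 to another data center.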

Now, with redirect caching, if we want to migrate a tenant to dc2, this is the process:

Behind the scenes, we migrate the tenant's existing logs to the new data center
We stream their logs from dc1 to dc2 until the logs are in sync
We change the tenant's account to indicate they are now in dc2
The servers in dc1 are informed of the change in data center for the tenant
The servers in dc1 close all active connections from the tenant's shippers
The shippers reconnect, but this time instead of a 200, they get a redirect to dc2 as described above
The shippers now connect and send logs to dc2
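
The server-side decision behind these steps can be sketched roughly as follows. This is an illustration only, not our proxy's code: the `Tenant` record, the `handle_request` function, and the 403 response for a blocked tenant are assumptions, and the redirect URL simply follows the `logs.<dc>.example.com` naming used in the examples above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Tenant:
    name: str
    datacenter: str        # data center the tenant's account currently points to
    allowed: bool = True   # False when the account is capped or blocked


def handle_request(local_dc: str, tenant: Tenant) -> Tuple[int, Dict[str, str]]:
    """Decide how a server in `local_dc` answers a tenant's connection."""
    if tenant.datacenter != local_dc:
        # Tenant lives elsewhere: send the shipper to the right data center.
        location = f"https://logs.{tenant.datacenter}.example.com"
        return 302, {"Location": location}
    if not tenant.allowed:
        # Assumed status code; the post only specifies the 200 case.
        return 403, {}
    return 200, {}


tenant = Tenant(name="acme", datacenter="dc1")
before = handle_request("dc1", tenant)   # tenant is local to dc1

# Migration: the account is switched to dc2, so dc1 now redirects.
tenant.datacenter = "dc2"
after = handle_request("dc1", tenant)
```

Once the account record changes, closing the tenant's active connections in dc1 is enough to trigger the redirect on reconnect.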

The process with redirect caching is scalable and can be automated. The migration can be initiated by the tenant or by the logging service either manually or automatically. The tenant doesn't have to reconfigure their shippers (which often involves redeploying servers or containers).

system · March 3, 2017, 6:57pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
