What's the ideal IdleTimeout of AWS ELB to avoid GatewayTimeout?

Context

We are running an Elasticsearch cluster with three nodes on AWS EC2. The nodes use the AWS Cloud Plugin to discover each other. In front of the cluster we put a private Elastic Load Balancer (ELB), so our Rails apps talk to that ELB to reach the Elasticsearch cluster. We are currently on Elasticsearch version 1.4.4.

Problem

Regularly, but not often, we receive a 504 Gateway Timeout status code from the ELB.

Question

What's the ideal IdleTimeout for an ELB that sits in front of an Elasticsearch cluster?

AWS describes in its Developer Guide how to set the idle timeout. The only thing I found concerning keep_alive timeouts in Elasticsearch is in this ThreadPool source code. But even if I set the ELB's IdleTimeout to 29 seconds, the 504 error still occurs.
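For reference, here is a minimal sketch of how the idle timeout could be set programmatically on a Classic ELB. It assumes Python with boto3 and configured AWS credentials, neither of which is part of our setup as described above; the function names are made up for illustration.

```python
def idle_timeout_attributes(seconds):
    """Build the LoadBalancerAttributes payload for a given idle timeout."""
    if not isinstance(seconds, int) or seconds < 1:
        raise ValueError("idle timeout must be a positive integer number of seconds")
    return {"ConnectionSettings": {"IdleTimeout": seconds}}


def set_idle_timeout(load_balancer_name, seconds):
    """Apply the idle timeout to a Classic ELB (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the payload helper works without boto3

    client = boto3.client("elb")
    return client.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )
```

A call would look like `set_idle_timeout("es-internal-elb", 60)`, where the load balancer name is hypothetical.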

So what's the keep_alive value of Elasticsearch?

I received an answer in the IRC channel:

the keepalive timeout is controlled by the kernel settings. ES doesn't control that. we only set the SO_KEEPALIVE option on the socket. so it depends on the kernel settings.
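To see what that means on a given node, the sketch below enables SO_KEEPALIVE on a socket and reads the kernel's TCP keepalive settings. It is written in Python and assumes a Linux host, where these values are exposed under /proc/sys/net/ipv4; the helper names are my own, and this is not how Elasticsearch itself is implemented.

```python
import socket
from pathlib import Path

# Linux-specific location of the kernel TCP keepalive settings (an assumption).
PROC_DIR = Path("/proc/sys/net/ipv4")
KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_kernel_keepalive(proc_dir=PROC_DIR):
    """Return the kernel keepalive settings; unreadable entries map to None."""
    settings = {}
    for key in KEEPALIVE_KEYS:
        try:
            settings[key] = int((Path(proc_dir) / key).read_text().strip())
        except (OSError, ValueError):
            settings[key] = None
    return settings


def make_keepalive_socket():
    """Create a TCP socket with only SO_KEEPALIVE enabled, timings left to the kernel."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    sock = make_keepalive_socket()
    print("SO_KEEPALIVE:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    sock.close()
    print(read_kernel_keepalive())
```

Comparing the reported tcp_keepalive_time (in seconds) with the ELB's IdleTimeout shows whether the kernel would send a keepalive probe before the ELB considers the connection idle.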
