How to achieve long-lived HTTP connections using an Elasticsearch client (e.g. the Tire gem)

I am doing bulk indexing using the Tire gem as the client for Elasticsearch:

index = Tire::Index.new('oldskool')
index.bulk_store(bulk_values)

I monitor the HTTP connections on my Elasticsearch cluster using the
HTTP monitoring API:

curl 'localhost:9200/_nodes/http/stats'

In the JSON response I get:

..."http":{"current_open" : 10, "total_opened" : 18345}

I observed that the "total_opened" value keeps increasing rapidly.
I think this means that the Tire gem is not using persistent connections
while bulk indexing (please correct me if I am wrong).
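To put a number on "increasing rapidly", the counter can be sampled before and after a bulk run and the two samples subtracted. The following is only a sketch: the helper names `http_totals` and `opened_between` are made up for illustration, and the response shape (a top-level "nodes" object whose entries each carry an "http" object) is an assumption based on the fragment shown above.

```ruby
require 'json'

COUNTERS = %w[current_open total_opened].freeze

# Sum the HTTP counters over every node in a stats response body.
# Assumed shape: {"nodes" => {node_id => {"http" => {...}}}}.
def http_totals(stats_json)
  nodes = JSON.parse(stats_json).fetch('nodes', {})
  totals = { 'current_open' => 0, 'total_opened' => 0 }
  nodes.each_value do |node|
    http = node['http'] || {}
    COUNTERS.each { |key| totals[key] += http.fetch(key, 0) }
  end
  totals
end

# Number of connections opened between two samples of the stats response.
def opened_between(before_json, after_json)
  http_totals(after_json)['total_opened'] - http_totals(before_json)['total_opened']
end
```

If the difference is roughly equal to the number of bulk requests sent in between, each request probably opened its own connection.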

How can I use Tire Gem to make persistent connections with Elasticsearch
while doing bulk indexing?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I see your question here, and on StackOverflow. And the more I think about
it, the more I agree that you have asked an excellent question.

Elasticsearch's HTTP interface being "RESTful" does not prevent it from
supporting persistent connections. ES uses Netty under the covers, and
Netty supports HTTP 1.1 persistent connections by default.
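For illustration, here is what reusing one HTTP 1.1 connection for several bulk requests looks like from the Ruby side. This is a sketch using Ruby's standard Net::HTTP rather than Tire, so it says nothing about Tire's internals. The helper names `bulk_body` and `bulk_index_persistent` are hypothetical, and the `/_bulk` path, the `_index`/`_type` action fields, and the Content-Type header are assumptions about the target cluster.

```ruby
require 'net/http'
require 'json'

# Build a bulk body: one action line plus one source line per document,
# newline-delimited and newline-terminated.
def bulk_body(index, type, docs)
  lines = docs.flat_map do |doc|
    [JSON.generate('index' => { '_index' => index, '_type' => type }),
     JSON.generate(doc)]
  end
  lines.join("\n") + "\n"
end

# Send every batch over a single session. Inside the Net::HTTP.start block
# the socket is kept open between requests as long as the server agrees.
# Returns the HTTP status code of each bulk request.
def bulk_index_persistent(host, port, index, type, batches)
  Net::HTTP.start(host, port) do |http|
    batches.map do |docs|
      request = Net::HTTP::Post.new('/_bulk', 'Content-Type' => 'application/json')
      request.body = bulk_body(index, type, docs)
      http.request(request).code
    end
  end
end
```

A call such as `bulk_index_persistent('localhost', 9200, 'oldskool', 'document', batches)` should then raise "total_opened" by one for the whole run rather than by one per batch, provided neither side closes the connection in between.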

So this is likely a Tire gem issue, but it would be good to know how
Elasticsearch sets up Netty for its HTTP interface. For example, I added
session timeouts to my own Netty-based RESTful server, but with Netty this
is a bit of a problem. Netty handles requests and responses on separate
streams. Consider the case where a response takes a long time to
complete. The client is, of course, waiting for this long response before
it sends the next request. But the Netty server's reader doesn't know
whether the client is idle or just waiting for the response. In other
words, in a Netty server, the left hand (reading client requests) has no
idea what the right hand (handling client requests and generating the
response) is doing.

So we had to very carefully adjust our client and server timeouts based on
the behavior of HTTP persistent connections, the Netty server and my
handler's response times, and our network firewall idle session timeout.

This goes beyond your question. But if you want to get the Tire gem to
use persistent connections, then you'll suddenly need to worry about all or
most of this. Maybe the Tire gem developers decided that creating a new
HTTP session for each bulk request (not each document, but each clump of
documents) was minimal overhead and avoided the need to properly handle all
of the nuances of HTTP 1.1 persistent connections.

Brian

(In reply to the message posted by hrishikesh prabhune on Thursday, August 8, 2013 10:39:47 PM UTC-4.)

ES counts Netty's upstream "channelOpen" ChannelEvents [1] in a channel
handler [2] and reports that count in the HTTP stats. It is not counting
HTTP connections. It is counting the channels Netty creates on TCP
connections, which may or may not be persistent/keepalive.

The ES HTTP server also enables TCP keepalive by default [3], as well as
socket reuse (except on Windows). The settings in [4] also apply to
the HTTP server in ES.

Check your system's network status with netstat to see whether there are
suspiciously long-lasting socket connections, or whether the number of
sockets in the CLOSE_WAIT state is unacceptably high.
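A quick way to get those numbers is to tally the socket states for the HTTP port. This is a rough sketch that assumes Linux-style `netstat -ant` output, where the fourth column is the local address and the sixth is the state. The helper name `socket_states` is made up for illustration.

```ruby
# Count TCP socket states for one local port in netstat output.
# Assumed columns: proto, recv-q, send-q, local address, foreign address, state.
def socket_states(netstat_output, port)
  counts = Hash.new(0)
  netstat_output.each_line do |line|
    fields = line.split
    # Skip banner and header lines and anything that is not a TCP socket.
    next unless fields.length >= 6 && fields[0].start_with?('tcp')
    local = fields[3]
    state = fields[5]
    next unless local.end_with?(":#{port}")
    counts[state] += 1
  end
  counts
end
```

Running `socket_states(%x(netstat -ant), 9200)` on the ES node gives a hash of state names to counts, so a growing CLOSE_WAIT entry is easy to spot.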

Jörg

[1] http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/ChannelEvent.html
[2] https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/common/netty/OpenChannelsHandler.java
[3] https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/transport/netty/NettyTransport.java
[4] http://www.elasticsearch.org/guide/reference/modules/network/
