Java client slower than HTTP

We are currently using TransportClient and observing that query response times are considerably slower than over HTTP (through the Sense plugin). Below is the code snippet for initializing the TransportClient:

void createClient(Map<String, String> props) {
    Settings settings = ImmutableSettings.settingsBuilder().put(props).build();
    transportClient = new TransportClient(settings);
    // getTransportHosts() returns a comma-separated list of host:port pairs
    for (String host : getTransportHosts().split(",")) {
        int index = host.indexOf(COLON);
        if (index < 0) {
            throw new IllegalArgumentException("Expected host:port, got: " + host);
        }
        String hostname = host.substring(0, index).trim();
        int port = Integer.parseInt(host.substring(index + 1).trim());
        transportClient.addTransportAddress(new InetSocketTransportAddress(hostname, port));
    }
}

Map<String, String> createClientConfigSettings() {
    Map<String, String> clientProperties = new HashMap<>();
    clientProperties.put(CLUSTER_NAME_PROPERTY, getClusterName());
    clientProperties.put("client.transport.ping_timeout", "10s");
    clientProperties.put("client.transport.nodes_sampler_interval", "10s");
    String nodeName = processInstanceName();
    clientProperties.put("node.name", nodeName);
    return clientProperties;
}

Please advise.

Thanks,
Prateek

What is "considerably slower"?
What are the queries you execute?
How do you measure query response time?

It's not visible in the code.

The following query was generated using the SearchRequestBuilder.toString() API from the Elasticsearch client library, version 1.4.3:

GET /index-1/_search
{
 "from": 0,
 "size": 10,
 "query": {
   "filtered": {
     "query": {
       "multi_match": {
         "query": "2015 Chevrolet Cruze",
         "fields": [
           "id",
           "field-1",
           "field-2",
           "field-3",
           "field-4",
           "field-etc"
         ],
         "analyzer": "my_synonym",
         "type": "cross_fields"
       }
     },
     "filter": {
       "bool": {
         "must": [
           {
             "terms": {
               "field-7": [
                 "val-1"
               ]
             }
           },
           {
             "terms": {
               "status": [
                 "stock",
                 "transit"
               ]
             }
           }
         ]
       }
     }
   }
 }
}

Output using Sense is:

{
  "took": 26,
  "timed_out": false,
  "_shards": {
     "total": 7,
     "successful": 7,
     "failed": 0
  },
  "hits": {
     "total": 346,
..

When the same query is executed using Java, our response times are in the 100 ms to 150 ms range. That is the 95th percentile value. For Java, we retrieve response times from org.elasticsearch.action.search.SearchResponse.getTookInMillis().

We are currently using Elasticsearch version 1.4.3.
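
For anyone comparing numbers, here is a minimal sketch of how a 95th percentile could be computed from collected took values. It assumes the nearest-rank method, which may not be the method used for the figures above. The class name TookPercentile and the sample values are hypothetical and not part of the Elasticsearch client library.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Elasticsearch client library.
class TookPercentile {

    // Returns the p-th percentile (p in 1..100) using the nearest-rank method.
    static long nearestRank(long[] tookMillis, int p) {
        if (tookMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be in 1..100");
        }
        long[] sorted = tookMillis.clone();
        Arrays.sort(sorted);
        // rank = ceil(p * n / 100), computed in integer arithmetic
        int rank = (int) (((long) p * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only
        long[] samples = {26, 110, 150, 120, 100};
        System.out.println("p95 took = " + nearestRank(samples, 95) + " ms");
    }
}
```

With only a handful of samples the 95th percentile is simply the largest value, so a comparison between Sense and Java is only meaningful when both sides collect a similar, reasonably large number of samples.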

And is the search type (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html) the same in both requests?

Yes, QUERY_THEN_FETCH. We have also referred to that link and have set explain to false.

I'm willing to dive deeper into this.

Things would be easier to trace with query profiling https://github.com/elastic/elasticsearch/pull/6699 but it's not ready yet.

The fact is that HTTP uses Java under the hood, so HTTP cannot be faster than Java.

Maybe you can try to repeat the case with the node client and find out if it makes any difference?

Is there any additional parameter that gets passed by the Java client versus HTTP/Sense?

The results given above are with 11 primaries and 1 replica. We repeated the tests with 7 primaries and 1 replica. But the results are the same. This is why we suspect that there is some additional parameter that is sent as part of the call that is not getting set using the HTTP request.

Additionally, with Elasticsearch 1.6.0 we see that SearchRequestBuilder.toString() does not give the complete picture of what is sent to the Elasticsearch nodes (see https://github.com/elastic/elasticsearch/pull/9944 and https://github.com/elastic/elasticsearch/issues/5576). This adds to the suspicion that there is more happening when building queries/filters using the Java client versus setting them via HTTP.

We can certainly try using the node client, but why do you think it may behave differently?

The node client has a different cluster rendezvous. It sends a discovery request (multicast or unicast ping), then joins the cluster and is visible to all nodes.

The transport client connects to the configured TCP/IP host addresses and uses these hosts as a proxy to access the cluster nodes. It does not join the cluster, so there is one more hop in the scatter/gather process.

There are no "hidden" parameters in the Java client.

I assume different times are measured. The "took" time is only the time used for executing the search, after the search request is received and before the search response is sent.

If you want to measure the true response time, you will have to measure on the client side, which adds the network transmission time of the request and the response.
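
To make the difference concrete, here is a minimal sketch of measuring wall-clock time on the client next to the server-reported took value. The RoundTripTimer class and its Timing holder are hypothetical names, not part of the Elasticsearch client library, and the search call is replaced by a stand-in that sleeps.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Elasticsearch client library.
class RoundTripTimer {

    // Holds the server-reported "took" next to the client-side wall-clock time.
    static final class Timing {
        final long tookMillis;
        final long wallMillis;

        Timing(long tookMillis, long wallMillis) {
            this.tookMillis = tookMillis;
            this.wallMillis = wallMillis;
        }

        // Time not covered by "took": network transfer, serialization, queuing.
        long overheadMillis() {
            return wallMillis - tookMillis;
        }
    }

    // Runs the search and measures wall-clock time around it.
    // The callable must return the server-reported took value in milliseconds.
    static Timing measure(Callable<Long> search) throws Exception {
        long start = System.nanoTime();
        long took = search.call();
        long wall = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Timing(took, wall);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real search call: sleeps 50 ms and reports took = 5 ms.
        Timing t = measure(() -> {
            Thread.sleep(50);
            return 5L;
        });
        System.out.println("took=" + t.tookMillis + " ms, wall=" + t.wallMillis
                + " ms, overhead=" + t.overheadMillis() + " ms");
    }
}
```

In a real test the callable would wrap the actual search and return SearchResponse.getTookInMillis(). Logging both numbers for the Java client would show whether the slowdown is in query execution or somewhere around it.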

Thanks... Help me understand... When you say that "different times are measured"... I didn't quite get this.

When I execute the query using Sense, I get back a JSON response and it has an attribute called took.
Similarly, when I execute the same query using the Elasticsearch client library, I get back a SearchResponse object. That object has an attribute called tookInMillis.

I have so far assumed that both of these represent the time taken by Elasticsearch to execute the query on its side, not the total response time.

I am only interested in knowing the time a request takes to execute on the Elasticsearch side. I don't care how much time it took for the response to get back to the caller. The time to execute the query on ES via Sense vs. the client library should be the same, IMO. Meaning "took" and "tookInMillis" should return the same numbers. Right?

Not sure if it matters to this discussion but my cluster is composed of the following:

  1. 2 client nodes (data=false and master=false)
  2. 3 master nodes (data=false and master=true)
  3. 22 data nodes (data=true and master=false)

My transport client in my java code connects to the 2 client nodes at startup of the JVM. When executing the queries using Sense, I connect only to my client nodes.

Based on what you have described, when I use the transport client or the node client, the result should be the same.