Elasticsearch Java client much slower than REST call

DMAC · November 13, 2012, 7:30pm

Hi

We are starting to use the Java API in Elasticsearch. The only problem is
that queries seem to take much longer to retrieve data than the same
request made with curl.

Our development server (19.08) has a very small index (2000 documents with 8
fields).

Retrieving ~1200 documents takes 17 seconds through the Java API,
versus < 1 second to get the same result using curl.

Here is the code I am using to test ES:

    LOG.info(String.format("Initializing connection to ElasticSearch %s/%s on %d",
            host, clusterName, port));
    settings = ImmutableSettings.settingsBuilder()
            .put("client.transport.sniff", true).build();
    client = new TransportClient(settings)
            .addTransportAddress(new InetSocketTransportAddress(host, port));
    searchRequest = client.prepareSearch(clusterName);
    mapper = new SerObjectMapper();

    BoolQueryBuilder query = boolQuery();
    for (String term : keywordList)
        query.should(fieldQuery("body", term));

    long start = System.currentTimeMillis();
    /* From here */
    SearchResponse response = client.prepareSearch(clusterName)
            .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
            .setQuery(query).setFrom(0).setSize(limit).setExplain(true)
            .execute()
            .actionGet();
    /* To here takes 17 seconds in Java. */
    SearchHit[] docs = response.getHits().getHits();
    System.err.println("Query took " + (System.currentTimeMillis() - start));
    for (SearchHit doc : docs)
        urlList.add((String) doc.getSource().get("url"));
    LOG.info(String.format("Returning %d results", urlList.size()));

I wonder if you could point out what I am doing wrong?

Thanks in advance

D.

--

jprante · November 13, 2012, 7:53pm

Switch off explain: setExplain(false).

Unfortunately, this is in the docs, but it's not the default, only an
optional setting.

Best regards,

Jörg

--

Derry_O_Sullivan · November 14, 2012, 8:41am

Two other points on this:

1. I'm not sure what limit is (1200?), but returning that many hits
   (versus the default of 10) makes a big difference.
2. Are you doing exactly the same search in the REST call (e.g.
   DFS_QUERY_THEN_FETCH search type, number of results, etc.)?

We have done lots of testing with both the Java API and HTTP/REST, across
many search types and limits, and I don't think I've ever seen such a
difference (or anything near that) in terms of timings. (Using ES 0.19.9
over a multi-node cluster with millions of docs.)
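
For anyone comparing the two paths: the comparison is only meaningful if both are timed the same way. Below is a minimal sketch, assuming Java 8 or later. `TimingHarness`, `medianMillis` and the stand-in workload are hypothetical names for illustration and are not part of the Elasticsearch API. It discards a few warm-up runs and reports the median of several timed runs, so one-off costs such as JIT compilation or connection setup do not dominate a single measurement.

```java
import java.util.Arrays;
import java.util.function.Supplier;

class TimingHarness {

    /**
     * Runs the task `warmups` times untimed, then `runs` times timed,
     * and returns the median elapsed time in milliseconds.
     */
    static <T> double medianMillis(Supplier<T> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.get(); // untimed: lets JIT compilation and connection setup settle
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): replace the body with the
        // Java-client search or the HTTP request you want to measure.
        Supplier<Integer> fakeSearch = () -> {
            int sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i % 7;
            }
            return sum;
        };
        System.out.printf("median: %.3f ms%n", medianMillis(fakeSearch, 5, 21));
    }
}
```

Wrapping the Java-client call in one `Supplier` and the HTTP call in another, with the same query, search type and size, gives two medians that can be compared directly.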

--

DMAC · December 4, 2012, 10:30am

Hi,

Thanks. Sorry for the slow response. It turns out it was my fault: the problem was the way I was serialising the data.

Regards

D.
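
For readers with a similar symptom, one way to find out whether the search itself or a later step (such as serialising the hits) is the slow part is to time each stage separately. This is a rough sketch, assuming Java 8 or later. `StageTimer`, its methods and the stand-in stages are hypothetical and only illustrate the idea; they are not the code used in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

class StageTimer {
    // Accumulated elapsed nanoseconds per stage name, in first-seen order.
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Runs one stage, records its elapsed time under `name`, and returns the stage's result. */
    <T> T time(String name, Supplier<T> stage) {
        long start = System.nanoTime();
        try {
            return stage.get();
        } finally {
            nanosByStage.merge(name, System.nanoTime() - start, Long::sum);
        }
    }

    /** Name of the stage with the largest accumulated time, or null if nothing was timed. */
    String slowestStage() {
        String slowest = null;
        long max = -1;
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }

    Map<String, Long> nanosByStage() {
        return nanosByStage;
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        // Stand-ins (assumptions): a cheap "search" and a deliberately slow "serialise" step.
        String hits = timer.time("search", () -> "hit1,hit2");
        String json = timer.time("serialise", () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            return "[\"" + hits.replace(",", "\",\"") + "\"]";
        });
        System.out.println(json + " slowest=" + timer.slowestStage());
    }
}
```

In real code the "search" stage would wrap the `execute().actionGet()` call and the later stages would wrap whatever is done with the hits, which makes it easier to see where the seconds go before blaming the client.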

--

NRS · October 3, 2016, 12:31pm

I am facing the same issue. When I use curl, the response takes around 40-50 ms. But when the same query is executed with execute().actionGet() through the TransportClient, it takes around 1000 ms. I am trying to figure out the root cause and a solution. Could anybody please help me out with this?

dadoonet · October 3, 2016, 1:15pm

You'd better open a new thread and describe exactly what you are doing and in which context (version).
I'm fairly sure you are not doing the exact same thing.

NRS · October 3, 2016, 3:05pm

Thanks for the reply.
I've opened a new thread: Java Transport Client slower than curl request
