Performance of term query with sorting

Hi All,

We are using ES version 0.20.2 and running five ES nodes, each with 32
GB RAM and 8 cores.

We have indexed 60 million records (100 GB of data) into ES. We need to run a
term query with sorting.

A term query without sorting returns results in about 3
secs, but the same term query with sorting takes around 150 secs.
Could somebody please explain the mechanism Elasticsearch follows when
executing the query (both with and without sort)?

We have already optimized the index via an external call.

Also, please suggest some optimization parameters.

Thanks,
Ankit

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ankit,

Could you please give some insight into the layout of your data and what the
query looks like?
How big is the result set the query hits (total_hits)?

For more insight into how querying is done, have a look at:
http://www.elasticsearch.org/guide/reference/api/search/search-type/

I believe the default search type is query_then_fetch; however, this is
missing from the documentation.
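
To make the difference between the search types concrete, here is a minimal sketch in plain Java (no Elasticsearch client involved) of how many hits come back for a given size. The class name, method names and per-shard match counts are made up for illustration; the only behaviour taken from the search-type page is that query_and_fetch lets every shard return its own `size` hits, while query_then_fetch returns `size` hits overall.

```java
public class SearchTypeSketch {

    // query_and_fetch: every shard runs the query and immediately returns its
    // own top `size` documents, so the caller can receive up to size * shards hits.
    static int hitsReturnedQueryAndFetch(int[] matchesPerShard, int size) {
        int total = 0;
        for (int matches : matchesPerShard) {
            total += Math.min(matches, size);
        }
        return total;
    }

    // query_then_fetch: shards first return only ids and sort values, the
    // coordinating node keeps the global top `size`, and only those are fetched.
    static int hitsReturnedQueryThenFetch(int[] matchesPerShard, int size) {
        long totalMatches = 0;
        for (int matches : matchesPerShard) {
            totalMatches += matches;
        }
        return (int) Math.min(totalMatches, size);
    }

    public static void main(String[] args) {
        int[] matchesPerShard = {50000, 40000, 70000}; // hypothetical numbers
        int size = 10000;
        System.out.println("query_and_fetch  : "
                + hitsReturnedQueryAndFetch(matchesPerShard, size));
        System.out.println("query_then_fetch : "
                + hitsReturnedQueryThenFetch(matchesPerShard, size));
    }
}
```

With three shards and size 10000 the sketch reports 30000 hits for query_and_fetch and 10000 for query_then_fetch, so the number of documents actually shipped back can grow with the shard count.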

Jaap Taal

[ Q42 BV | tel 070 44523 42 | direct 070 44523 65 | http://q42.nl |
Waldorpstraat 17F, Den Haag | Vijzelstraat 72 unit 4.23, Amsterdam | KvK
30164662 ]

Hi Jaap,

We have indexed 60 million records; each record contains 31
columns (rowId, c0, c1, c2, c3.., c29).

Below is our index mapping:

{
  "ipdr": {
    "_source": {"enabled": false},
    "properties": {
      "c0": {"type": "long", "store": "yes"},
      "c1": {"type": "string"},
      "c11": {"type": "string"},
      "c12": {"type": "string"},
      "c13": {"type": "string"},
      "c14": {"type": "string"},
      "c15": {"type": "string"},
      "c16": {"type": "string"},
      "c17": {"type": "string"},
      "c18": {"type": "string"},
      "c19": {"type": "string"},
      "c2": {"type": "string"},
      "c20": {"type": "string"},
      "c21": {"type": "string"},
      "c22": {"type": "string"},
      "c23": {"type": "string"},
      "c24": {"type": "string"},
      "c25": {"type": "string"},
      "c26": {"type": "string"},
      "c27": {"type": "string"},
      "c28": {"type": "string"},
      "c29": {"type": "string"},
      "c3": {"type": "string"},
      "c30": {"type": "string"},
      "c4": {"type": "string"},
      "c5": {"type": "string"},
      "c6": {"type": "string"},
      "c7": {"type": "string"},
      "c8": {"type": "string"},
      "c9": {"type": "string"},
      "rowId": {"type": "string", "index": "no", "store": "yes"}
    }
  }
}

Below is the sample code we are using to retrieve 10000 records from Elasticsearch.

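import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.sort.SortOrder;

// Assumes a connected org.elasticsearch.client.Client field named "client".
public void SearchQuery() {
	long start = System.currentTimeMillis(); // start timing before the request is built

	// Match documents whose c29 field contains the term "udp".
	QueryBuilder queryBuilder = QueryBuilders.boolQuery()
			.must(QueryBuilders.termQuery("c29", "udp"));

	SearchRequestBuilder searchRequestBuilder = client
			.prepareSearch("89854", "89855", "89853")
			.setSearchType(SearchType.QUERY_AND_FETCH)
			.setQuery(queryBuilder)
			.setSize(10000);
	searchRequestBuilder.addSort("c0", SortOrder.DESC); // sort on the long field c0, highest first
	SearchResponse response = searchRequestBuilder.execute().actionGet();
	SearchHits hits = response.getHits();
	System.out.println("Total Hits : " + hits.getTotalHits()); // output is 2
	int i = 0;
	for (SearchHit hit : hits) {
		System.out.println("id = " + hit.getId() + "" + i++); // prints the hit id followed by a running counter
	}

	long diff = System.currentTimeMillis() - start;
	System.out.println("Time Taken in millisecs = " + diff);
	System.out.println("Done");
}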

Thanks & Regards,
Ankit Jain

If you need to extract that many records, prefer the scan & scroll feature:
http://www.elasticsearch.org/guide/reference/api/search/scroll/
You probably don't need to display that many results to a single user, do you?

That said, when sorting, ES has to load all the values of your field c0 and sort them on each shard. It then sorts the result set again on the gathering node.
That could explain things here if c0 has 60 000 000 different values!
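
The two sorting steps described above can be sketched in plain Java as follows. This is only an illustration of the idea (sort on each shard, then sort again on the gathering node), not how ES implements it internally; the class, the method names and the tiny in-memory "shards" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ShardSortSketch {

    // Step 1, on each shard: sort that shard's c0 values in descending order
    // and keep only the first `size` of them.
    static List<Long> topOnShard(List<Long> shardValues, int size) {
        List<Long> sorted = new ArrayList<>(shardValues);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(size, sorted.size())));
    }

    // Step 2, on the gathering node: combine the per-shard results,
    // sort them again and keep the global top `size`.
    static List<Long> mergeOnGatheringNode(List<List<Long>> perShardTop, int size) {
        List<Long> all = new ArrayList<>();
        for (List<Long> top : perShardTop) {
            all.addAll(top);
        }
        all.sort(Collections.reverseOrder());
        return new ArrayList<>(all.subList(0, Math.min(size, all.size())));
    }

    // Runs both steps over a list of shards.
    static List<Long> search(List<List<Long>> shards, int size) {
        List<List<Long>> perShardTop = new ArrayList<>();
        for (List<Long> shard : shards) {
            perShardTop.add(topOnShard(shard, size));
        }
        return mergeOnGatheringNode(perShardTop, size);
    }

    public static void main(String[] args) {
        List<List<Long>> shards = Arrays.asList(
                Arrays.asList(5L, 1L, 9L),
                Arrays.asList(7L, 3L),
                Arrays.asList(8L, 2L, 6L));
        System.out.println(search(shards, 3));
    }
}
```

Every shard has to look at all of its own values before it can hand back its top entries, which is why the cost grows with the number of values loaded for c0 rather than with the number of hits requested.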

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs
