Improve Query Performance

Hi guys,

We have a 5-node ES cluster; each node has 32 GB of RAM and 8 cores.

We generate 250 million records (250 GB) per day and index them into
Elasticsearch. We create a new index every hour, so there are 24 indexes
per day. We run time-range queries, so each query targets only the
relevant indexes.

Below is the index mapping.
{"ipdr":{"_source":{"enabled":false},"properties":{"c0":{"type":"long","store":"yes"},"c1":{"type":"string"},"c11":{"type":"string"},"c12":{"type":"string"},"c13":{"type":"string"},"c14":{"type":"string"},"c15":{"type":"string"},"c16":{"type":"string"},"c17":{"type":"string"},"c18":{"type":"string"},"c19":{"type":"string"},"c2":{"type":"string"},"c20":{"type":"string"},"c21":{"type":"string"},"c22":{"type":"string"},"c23":{"type":"string"},"c24":{"type":"string"},"c25":{"type":"string"},"c26":{"type":"string"},"c27":{"type":"string"},"c28":{"type":"string"},"c29":{"type":"string"},"c3":{"type":"string"},"c30":{"type":"string"},"c4":{"type":"string"},"c5":{"type":"string"},"c6":{"type":"string"},"c7":{"type":"string"},"c8":{"type":"string"},"c9":{"type":"string"},"rowId":{"type":"string","index":"no","store":"yes"}}}}

Sample query that we are running:

c1:"abc" and c2:"qwe" and range="t1" to "t2", with the results sorted by
column c0.

The query takes 40 seconds to return results.

Please suggest how we can improve the query performance.

Thank you very much in advance.

Thanks,
Ankit Jain

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

What happens if you don't sort?
How many shards do you have per index?
Why do you use so many indexes? Are you rolling the index every hour?

One thing that jumps to mind is the range query, but I can't really tell
much without seeing a real query. Can you paste one?

simon

Hi David,

Thanks for the response.

Why do you use so many indexes? Are you rolling the index every hour?
We need to run range queries by the hour, so we only have to search a
limited set of small indexes instead of one big index.

How many shards do you have per index?
5 shards for each index.

What happens if you don't sort?
The results come back much faster.

Please suggest some optimizations to increase query performance.

Thanks & Regards,
Ankit Jain
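
For a rough sense of scale, the figures quoted in this thread (250 million records per day, 24 hourly indexes, 5 shards per index) can be turned into per-index and per-shard document counts. This is only a back-of-the-envelope sketch in plain Java. The class and method names are made up for illustration, and it assumes records arrive evenly through the day and spread evenly across shards, which the thread does not confirm.

```java
// Back-of-the-envelope sizing; the inputs are the figures quoted in this thread.
class IndexSizing {
    /** Documents per hourly index, assuming an even arrival rate over the day. */
    static long docsPerIndex(long docsPerDay, int indexesPerDay) {
        return docsPerDay / indexesPerDay;
    }

    /** Documents per shard, assuming documents spread evenly over the shards. */
    static long docsPerShard(long docsPerIndex, int shardsPerIndex) {
        return docsPerIndex / shardsPerIndex;
    }

    public static void main(String[] args) {
        long perIndex = docsPerIndex(250_000_000L, 24);
        long perShard = docsPerShard(perIndex, 5);
        System.out.println("docs per hourly index ~ " + perIndex);
        System.out.println("docs per shard        ~ " + perShard);
    }
}
```

Under those assumptions each hourly index holds roughly 10.4 million documents and each shard roughly 2.1 million.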

Hi Simonw,

Thanks for the response.

Here is the sample code I was using.

    import org.elasticsearch.action.search.SearchRequestBuilder;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchType;
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.SearchHits;
    import org.elasticsearch.search.sort.SortOrder;

    // "client" is an already-connected org.elasticsearch.client.Client.
    long start = System.currentTimeMillis();

    // Time range on c0.
    QueryBuilder queryBuilder1 = QueryBuilders.rangeQuery("c0")
            .from("1293868800").to("1293883200");
    QueryBuilder queryBuilder = QueryBuilders.boolQuery()
            .must(QueryBuilders.termQuery("c2", "375191"))
            .must(QueryBuilders.termsQuery("c1", "50", "51", "53", "54"))
            .must(QueryBuilders.termQuery("c24", "IGNORE"))
            .must(queryBuilder1);

    // 359408, 359409, 359410 and 359411 are the names of the hourly indexes.
    SearchRequestBuilder searchRequestBuilder = client
            .prepareSearch("359408", "359409", "359410", "359411")
            .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
            .setQuery(queryBuilder)
            .setSize(10000);
    searchRequestBuilder.addSort("c0", SortOrder.DESC);

    SearchResponse response = searchRequestBuilder.execute().actionGet();
    SearchHits hits = response.getHits();
    System.out.println("Total Hits : " + hits.getTotalHits());
    int i = 0;
    for (SearchHit hit : hits) {
        // System.out.println("id = " + hit.getId() + "----" + i++);
    }

    // Elapsed wall-clock time for the whole search, including the fetch.
    long diff = System.currentTimeMillis() - start;
    System.out.println("Time Taken in millisecs = " + diff);
    System.out.println("Done");

Please suggest some optimizations.

Thanks,
Ankit

OK. So you are running a sort across 4 indices, each with 5 shards. That means you are running 20 shard-level requests at a time.
I would try reducing the number of shards to one and, at first, run the request against a single index to see whether the response time is what you expect.

Then try adding another index to the request, or compare with a multi search request in which each search targets a different index, and see where it goes.

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs
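
To make the fan-out described above concrete, here is a simplified model in plain Java of what a sorted search across several shards implies: one request per shard, each shard handing back up to `size` sorted entries, and a merge step that keeps the overall top `size`. This is a sketch, not Elasticsearch's actual implementation. The class and method names are hypothetical, and the `size` of 10000 is taken from the sample code posted earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class ShardMergeSketch {
    /** One shard-level search per shard of every index that is queried. */
    static int shardRequests(int indices, int shardsPerIndex) {
        return indices * shardsPerIndex;
    }

    /**
     * Merges per-shard lists that are already sorted in descending order
     * and returns the overall top `size` values, also in descending order.
     */
    static List<Long> mergeTopDesc(List<List<Long>> perShard, int size) {
        // Queue entries are {shard number, position in that shard's list},
        // ordered so the largest current sort value is polled first.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
                (a, b) -> Long.compare(perShard.get(b[0]).get(b[1]),
                                       perShard.get(a[0]).get(a[1])));
        for (int s = 0; s < perShard.size(); s++) {
            if (!perShard.get(s).isEmpty()) {
                queue.add(new int[] {s, 0});
            }
        }
        List<Long> merged = new ArrayList<>();
        while (merged.size() < size && !queue.isEmpty()) {
            int[] head = queue.poll();
            List<Long> shard = perShard.get(head[0]);
            merged.add(shard.get(head[1]));
            // Advance within the shard the value came from, if it has more entries.
            if (head[1] + 1 < shard.size()) {
                queue.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        int requests = shardRequests(4, 5);
        System.out.println("shard-level requests: " + requests);
        System.out.println("max entries to merge: " + (long) requests * 10000);
        List<List<Long>> perShard = Arrays.asList(
                Arrays.asList(9L, 5L, 1L), Arrays.asList(8L, 7L));
        System.out.println(mergeTopDesc(perShard, 3));
    }
}
```

With 4 indices of 5 shards each and a size of 10000, that is 20 shard-level requests and up to 200,000 sorted entries to merge for a single search, which is one reason to test with fewer shards and a single index first.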

Hi David,

Thanks for your response, David.

I have a five-node cluster and the shards are evenly distributed across
the nodes (each node has 1 shard of each index). I tried with one shard,
but the performance was worse than in the 5-shard case.

Please suggest some optimizations.

Thanks,
Ankit Jain
iLabs

If you remove the range query, how much of a performance improvement do you see?

simon

Hi,

Those are just my two cents, but when you say "We are firing a time range
query, so our query are targeting only selected indexes."

Are you sure about this? It is unclear from your post.

I mean, OK, the data relevant to a time-range query lives in a few
indices. But Elasticsearch has no way of knowing that. If you just ask
your cluster for results, it has to check each and every index, even
those with no matching data, which wastes time.

Do you explicitly specify which indices you are querying?

Then again, it may be a silly question. Sorry if that's the case.
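
For what it's worth, the index names in the sample code posted earlier in the thread look like hours since the Unix epoch (359408 × 3600 = 1293868800, the lower bound of the range in that code). Assuming that naming scheme holds, which the thread does not state outright, the list of indices to pass explicitly can be derived from the time range. A minimal sketch in plain Java, with a hypothetical class and method name, treating the upper bound as exclusive and the timestamps as non-negative:

```java
import java.util.ArrayList;
import java.util.List;

class HourlyIndexNames {
    static final long SECONDS_PER_HOUR = 3600L;

    /**
     * Names of the hourly indexes that overlap [fromEpochSeconds, toEpochSeconds).
     * Assumes each index is named after its hour number since the Unix epoch.
     */
    static List<String> indicesFor(long fromEpochSeconds, long toEpochSeconds) {
        List<String> names = new ArrayList<>();
        if (toEpochSeconds <= fromEpochSeconds) {
            return names; // empty or inverted range: nothing to query
        }
        long firstHour = fromEpochSeconds / SECONDS_PER_HOUR;
        // Subtract one second so an upper bound exactly on the hour
        // does not pull in the following index.
        long lastHour = (toEpochSeconds - 1) / SECONDS_PER_HOUR;
        for (long hour = firstHour; hour <= lastHour; hour++) {
            names.add(Long.toString(hour));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(indicesFor(1293868800L, 1293883200L));
    }
}
```

For the range used in the sample code this yields the same four names that were passed to prepareSearch there, so only those indices are searched.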

My suspicion is that the range is actually a string range and not a numeric
range. Can you provide the mapping of your index here?

simon
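
To illustrate why the distinction matters: a range over strings compares character by character, so it can match values that a numeric range would reject. The sketch below is plain Java with hypothetical names and says nothing about how Elasticsearch executes either kind of range. It only shows the difference in semantics, using the bounds from the sample code earlier in the thread. Note that the mapping in the first post declares c0 as long.

```java
class StringVsNumericRange {
    /** Inclusive range check using lexicographic (string) ordering. */
    static boolean inStringRange(String value, String from, String to) {
        return value.compareTo(from) >= 0 && value.compareTo(to) <= 0;
    }

    /** Inclusive range check using numeric ordering. */
    static boolean inNumericRange(long value, long from, long to) {
        return value >= from && value <= to;
    }

    public static void main(String[] args) {
        // 12938700 is far below the lower bound numerically, but as a
        // string it sorts between the two bounds.
        System.out.println("string range:  "
                + inStringRange("12938700", "1293868800", "1293883200"));
        System.out.println("numeric range: "
                + inNumericRange(12938700L, 1293868800L, 1293883200L));
    }
}
```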
