Help debugging performance issues

Hello,

We run an Elasticsearch cluster and have been experiencing random slow
queries on it. Any help in debugging the source of the slowness would be
appreciated. Occasional searches take 1.5 to 12 seconds, while most
searches are much faster. Our goal is to return search results in under
1 second.

We recently installed Elasticsearch 0.90.2 and re-indexed all of our data
(rather than upgrading the existing 0.19.11 systems). We have over
180,000,000 documents, configured as 5 shards with 1 replica. The cluster
is deployed in Amazon EC2 on 2 m1.xlarge instances, and both cluster
members run with a 10 GB heap. CPU and I/O load on the systems is very
low, and indexing performance seems excellent. We run a refresh interval
of 10 seconds. Current disk usage for the data is under 140 GB across both
search instances.

Here is our schema (output from the cluster state API):

{
"cluster_name": "Open",
"master_node": "2MascUtjSCmSLyrGlxiBYg",
"blocks": {},
"nodes": {
"2MascUtjSCmSLyrGlxiBYg": {
"name": "Black Panther",
"transport_address": "inet[/10.0.4.51:9300]",
"attributes": {}
},
"HTGsGWe9QfaqdJbvzE9qaw": {
"name": "Darkdevil",
"transport_address": "inet[/10.0.8.51:9300]",
"attributes": {}
}
},
"metadata": {
"templates": {},
"indices": {
"people_v1": {
"state": "open",
"settings": {
"index.number_of_shards": "5",
"index.number_of_replicas": "1",
"index.version.created": "900299",
"index.analysis.filter.my_ngram.min_gram": "1",
"index.analysis.filter.my_metaphone.replace": "true",
"index.analysis.filter.my_metaphone.type": "phonetic",
"index.analysis.analyzer.metaphone_analyzer.tokenizer": "standard",
"index.analysis.filter.nickname_filter.synonyms_path": "analysis/names1.2.csv",
"index.analysis.filter.nickname_filter.type": "synonym",
"index.analysis.analyzer.ngram_analyzer.tokenizer": "standard",
"index.analysis.filter.my_ngram.type": "edgeNGram",
"index.analysis.analyzer.email_analyzer.tokenizer": "uax_url_email",
"index.analysis.analyzer.nickname_analyzer.tokenizer": "standard",
"index.analysis.filter.my_metaphone.encoder": "metaphone",
"index.analysis.analyzer.standard_analyzer.tokenizer": "standard",
"index.analysis.analyzer.metaphone_analyzer.filter.1": "lowercase",
"index.analysis.analyzer.metaphone_analyzer.filter.0": "standard",
"index.analysis.filter.my_ngram.max_gram": "5",
"index.analysis.analyzer.nickname_analyzer.filter.0": "standard",
"index.analysis.analyzer.domain_analyzer.tokenizer": "uax_url_email",
"index.analysis.analyzer.ngram_analyzer.filter.1": "lowercase",
"index.analysis.analyzer.ngram_analyzer.filter.0": "standard",
"index.analysis.analyzer.nickname_analyzer.filter.2": "nickname_filter",
"index.analysis.analyzer.nickname_analyzer.filter.1": "lowercase",
"index.analysis.analyzer.standard_analyzer.filter.1": "lowercase",
"index.analysis.analyzer.metaphone_analyzer.filter.2": "my_metaphone",
"index.analysis.analyzer.standard_analyzer.filter.0": "standard",
"index.analysis.analyzer.ngram_analyzer.filter.2": "my_ngram",
"index.refresh_interval": "10s"
},
"mappings": {
"person": {
"_source": {
"compress": true
},
"_routing": {
"path": "userid",
"required": true
},
"properties": {
"email": {
"analyzer": "email_analyzer",
"type": "string"
},
"nameOf": {
"index": "no",
"type": "string"
},
"company": {
"type": "multi_field",
"fields": {
"company_ngram": {
"include_in_all": false,
"analyzer": "ngram_analyzer",
"type": "string"
},
"company": {
"analyzer": "standard",
"type": "string"
}
}
},
"userid": {
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs",
"type": "string"
},
"score": {
"type": "integer"
},
"domain": {
"analyzer": "domain_analyzer",
"type": "string"
},
"ownerid": {
"index": "no",
"type": "string"
},
"ownernameof": {
"index": "no",
"type": "string"
},
"key": {
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs",
"type": "string"
},
"namepart": {
"type": "multi_field",
"fields": {
"nickname": {
"include_in_all": false,
"analyzer": "nickname_analyzer",
"type": "string"
},
"metaphone": {
"include_in_all": false,
"analyzer": "metaphone_analyzer",
"boost": 0.8,
"type": "string"
},
"ngram": {
"include_in_all": false,
"analyzer": "ngram_analyzer",
"type": "string"
},
"namepart": {
"analyzer": "standard_analyzer",
"type": "string"
}
}
}
}
}
},
"aliases": []
}
}
},
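
For reference, the edgeNGram filter above (min_gram 1, max_gram 5) indexes each token's leading substrings. A rough sketch of what it emits (illustrative only, not Elasticsearch's actual implementation):

```python
def edge_ngrams(token, min_gram=1, max_gram=5):
    """Leading substrings of a token, as an edgeNGram filter would emit them."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("charles"))  # ['c', 'ch', 'cha', 'char', 'charl']
print(edge_ngrams("as"))       # ['a', 'as']
```

This is why term lookups against the ngram fields work for inputs of 5 characters or less: longer inputs no longer match any indexed gram.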

We use routing in our implementation, and all the data being searched is
segmented by userid.
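
With routing, all of a userid's documents land on one shard, so each search that supplies a routing value hits a single shard instead of all 5. Conceptually it works like the sketch below (Elasticsearch 0.90 uses its own DJB-style hash internally; this is only an illustration of the idea and won't reproduce ES's actual shard assignment):

```python
def djb2(value):
    """DJB-style string hash (illustrative; not guaranteed to match ES's hash)."""
    h = 5381
    for ch in value:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF  # keep it bounded, like a 32-bit int
    return h

def shard_for(routing, num_shards=5):
    """Every document and search with the same routing value maps to one shard."""
    return djb2(routing) % num_shards

print(shard_for("9b3ac7e80c7968b8ca1d6e56f96a86c6"))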

Our current searches are similar to the following:

curl -XPOST
'http://localhost:9200/people_v1/person/_search?pretty=true&routing=9b3ac7e80c7968b8ca1d6e56f96a86c6'
-d '
{
size: 9,
sort:
{
score:
{
order: "desc"
}
},
query:
{
filtered:
{
query:
{
query_string: { query:
"userid:9b3ac7e80c7968b8ca1d6e56f96a86c6" }
},
filter:
{
and:[
{
or:[
{
term:
{
ngram: "as"
}
},
{
term:
{
company_ngram: "as"
}
},
{
query:
{
match:
{
nickname: "m"
}
}
}]
},
{
not:
{
term:
{
key: "9b3ac7e80c7968b8ca1d6e56f96a86c605d73158bcf9968d0d6bcd99db050f4b"
}
}
}]
}
}
}
}
}'

I've tried different mechanisms of searching, but still experience the
random slowness. For instance, this query exhibits the same behavior:

curl -XPOST
'http://localhost:9200/people_v1/person/_search?pretty=true&routing=9b3ac7e80c7968b8ca1d6e56f96a86c6'
-d '
{
size: 9,
query:
{
filtered:
{
query:
{
custom_score:
{
query:
{
term: { userid: "9b3ac7e80c7968b8ca1d6e56f96a86c6" }
},
script: "_source.score"
}
},
filter:
{
and:[
{
or:[
{
prefix:
{
namepart: "chuck"
}
},
{
prefix:
{
company: "chuck"
}
},
{
query:
{
match:
{
nickname: "chuck"
}
}
}]
},
{
not:
{
term:
{
key: "9b3ac7e80c7968b8ca1d6e56f96a86c605d73158bcf9968d0d6bcd99db050f4b"
}
}
}]
}
}
}
}
}'

Our results need to be sorted by a "score" that we pre-compute (the query
above attempts to use that score). Our searches use an edge ngram to
retrieve results for inputs up to 5 characters; beyond that, we switch to
prefix queries. Essentially, searches retrieve results based on name,
nickname, and company name, sorted by score.
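
The switch-over described above can be sketched as client-side query construction (a hypothetical helper; the field names come from the mapping, and the length threshold matches max_gram = 5):

```python
MAX_GRAM = 5  # matches index.analysis.filter.my_ngram.max_gram

def name_filter(text):
    """Build the or-filter: ngram term lookups for short input, prefix beyond max_gram."""
    text = text.lower()
    if len(text) <= MAX_GRAM:
        return {"or": [
            {"term": {"ngram": text}},
            {"term": {"company_ngram": text}},
            {"query": {"match": {"nickname": text}}},
        ]}
    return {"or": [
        {"prefix": {"namepart": text}},
        {"prefix": {"company": text}},
        {"query": {"match": {"nickname": text}}},
    ]}

print(name_filter("as")["or"][0])       # {'term': {'ngram': 'as'}}
print(name_filter("charles")["or"][0])  # {'prefix': {'namepart': 'charles'}}
```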

Our results from "curl -XGET 'http://localhost:9200/_segments?pretty=true'"
show 17 to 30 search segments per shard. I've read advice online that
running optimize may help, but I've also read elsewhere that you should
not run optimize because it will get you into trouble, and that you should
let Lucene handle merging.

Any advice on how we can debug this issue/improve performance?

Thanks,
-Mike

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hello Mike,

Sounds like warmers
(http://www.elasticsearch.org/guide/reference/api/admin-indices-warmers/)
will help with caching, if your load is generally low and you get slow
queries. This will raise the load when indexing, though.

Regarding optimizing: it's a bad idea if your indices are changing,
because a merge (which will happen with subsequent indexing) invalidates
caches, and if you have only one (or just a few) segments, a merge will
invalidate a lot of cached data. That's not the case if you have, for
example, time-based indices, and you know that yesterday's index will never
change. Then it should help to optimize, if you have I/O and CPU to spare.
And it seems you do :)

You can try changing the merge policy
(http://www.elasticsearch.org/guide/reference/index-modules/merge/)
if your index is changing. You can tune it for fewer segments. Although, if
you have very few segments, you'll run into the same issue as with
optimizing.

Another thing you can do is look at what happens with your ES cluster
when such a query runs slowly. What are the caches doing? What's the GC
doing? What are CPU and I/O doing at the time? If you don't have a tool
that shows you this, you can see it with our SPM
(http://sematext.com/spm/elasticsearch-performance-monitoring/).

If a GC spike is the problem, you can try changing to the G1 GC. This
helped us at one point
(http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/).

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

Thank you for the response. I did notice the warmers API, but what is not
entirely clear is what makes a good warmer. What types of queries should
be executed so that the cache is sufficiently warmed? For instance, we do
sorting in our queries. Should the warmer sort? If so, it wasn't clear to
me from the API how to specify sort criteria (there is a "query" field but
no "sort" field).

Thanks,
-Mike

Thank you for the response. I did notice the warmers API, but what is not
entirely clear, is what makes a good warmer? What types of queries should
be executed such that the cache is sufficiently warmed? For instance, we
do sorting in our queries. Should the warmer sort? If so, it wasn't clear
to me from the API how this can be specified (there is a "query" field but
no "sort" field).

Thanks again,
-Mike

On Tuesday, July 30, 2013 12:35:56 PM UTC-4, Radu Gheorghe wrote:

Hello Mike,

Sounds like warmershttp://www.elasticsearch.org/guide/reference/api/admin-indices-warmers/will help with caching, if your load is generally low and you get slow
queries. This will raise the load when indexing, though.

Regarding optimizing, it's a bad idea if your indices are changing.
Because a merge (which will happen in subsequent indexing) will invalidate
caches, and if you have only one (or just a few) segments, you'll have a
lot of invalidated caches due to a merge. That's not the case if you have,
for example, time-based indices, and you know that yesterday's index will
never change. Then it should help to optimize, if you have I/O and CPU to
spare. And it seems you do :slight_smile:

You can try changing the merge policyhttp://www.elasticsearch.org/guide/reference/index-modules/merge/,
if your index is changing. You can tune it for less segments. Although, if
you'll have very few segments, you'll run into the same issue as with
optimizing.

Another thing you can do is to look at what happens with your ES cluster
when such a query runs slowly. What are the caches doing? What's the GC
doing? What's CPU and I/O doing at the time? If you don't have a tool that
shows you this, you can see it with our SPMhttp://sematext.com/spm/elasticsearch-performance-monitoring/
.

If a GC spike is a problem, you can try changing to the G1 GC. This helped
us at one pointhttp://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/
.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Tue, Jul 30, 2013 at 7:07 PM, Mike <mthe...@gmail.com <javascript:>>wrote:

Hello,

We run an elastic search cluster that we've been experiencing random slow
queries on. Any help in debugging the source of the slow query would be
appreciated. Basically, we are getting occasional searches that are 1.5 ->
12 seconds. Most searches are much faster. The goal would be to return
search results in less than 1 second.

We run Elastic Search 0.90.2 that we've recently installed, and we
re-indexed all of our data (as opposed to upgrading the existing .19.11
systems). We have over 180,000,000 documents. We are configured to run 5
shards with 1 replica. Our systems are deployed in Amazon EC2 on 2
m1.xlarge instances. Both Elastic Search cluster members run with a heap
of 10Gig. CPU and I/O load on the systems are very low. Indexing
performance seems excellent. We run a refresh interval of 10 seconds.
Current disk usage for the data is under 140gig across both search
instances.

Here is our schema:

{
"cluster_name": "Open",
"master_node": "2MascUtjSCmSLyrGlxiBYg",
"blocks": {},
"nodes": {
"2MascUtjSCmSLyrGlxiBYg": {
"name": "Black Panther",
"transport_address": "inet[/10.0.4.51:9300]",
"attributes": {}
},
"HTGsGWe9QfaqdJbvzE9qaw": {
"name": "Darkdevil",
"transport_address": "inet[/10.0.8.51:9300]",
"attributes": {}
}
},
"metadata": {
"templates": {},
"indices": {
"people_v1": {
"state": "open",
"settings": {
"index.number_of_shards": "5",
"index.number_of_replicas": "1",
"index.version.created": "900299",
"index.analysis.filter.my_ngram.min_gram": "1",
"index.analysis.filter.my_metaphone.replace": "true",
"index.analysis.filter.my_metaphone.type": "phonetic",

"index.analysis.analyzer.metaphone_analyzer.tokenizer": "standard",
"index.analysis.filter.nickname_filter.synonyms_path":
"analysis/names1.2.csv",
"index.analysis.filter.nickname_filter.type":
"synonym",
"index.analysis.analyzer.ngram_analyzer.tokenizer":
"standard",
"index.analysis.filter.my_ngram.type": "edgeNGram",
"index.analysis.analyzer.email_analyzer.tokenizer":
"uax_url_email",
"index.analysis.analyzer.nickname_analyzer.tokenizer":
"standard",
"index.analysis.filter.my_metaphone.encoder":
"metaphone",
"index.analysis.analyzer.standard_analyzer.tokenizer":
"standard",
"index.analysis.analyzer.metaphone_analyzer.filter.1":
"lowercase",
"index.analysis.analyzer.metaphone_analyzer.filter.0":
"standard",
"index.analysis.filter.my_ngram.max_gram": "5",
"index.analysis.analyzer.nickname_analyzer.filter.0":
"standard",
"index.analysis.analyzer.domain_analyzer.tokenizer":
"uax_url_email",
"index.analysis.analyzer.ngram_analyzer.filter.1":
"lowercase",
"index.analysis.analyzer.ngram_analyzer.filter.0":
"standard",
"index.analysis.analyzer.nickname_analyzer.filter.2":
"nickname_filter",
"index.analysis.analyzer.nickname_analyzer.filter.1":
"lowercase",
"index.analysis.analyzer.standard_analyzer.filter.1":
"lowercase",
"index.analysis.analyzer.metaphone_analyzer.filter.2":
"my_metaphone",
"index.analysis.analyzer.standard_analyzer.filter.0":
"standard",
"index.analysis.analyzer.ngram_analyzer.filter.2":
"my_ngram",
"index.refresh_interval": "10s"
},
"mappings": {
"person": {
"_source": {
"compress": true
},
"_routing": {
"path": "userid",
"required": true
},
"properties": {
"email": {
"analyzer": "email_analyzer",
"type": "string"
},
"nameOf": {
"index": "no",
"type": "string"
},
"company": {
"type": "multi_field",
"fields": {
"company_ngram": {
"include_in_all": false,
"analyzer": "ngram_analyzer",
"type": "string"
},
"company": {
"analyzer": "standard",
"type": "string"
}
}
},
"userid": {
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs",
"type": "string"
},
"score": {
"type": "integer"
},
"domain": {
"analyzer": "domain_analyzer",
"type": "string"
},
"ownerid": {
"index": "no",
"type": "string"
},
"ownernameof": {
"index": "no",
"type": "string"
},
"key": {
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs",
"type": "string"
},
"namepart": {
"type": "multi_field",
"fields": {
"nickname": {
"include_in_all": false,
"analyzer": "nickname_analyzer",
"type": "string"
},
"metaphone": {
"include_in_all": false,
"analyzer": "metaphone_analyzer",
"boost": 0.8,
"type": "string"
},
"ngram": {
"include_in_all": false,
"analyzer": "ngram_analyzer",
"type": "string"
},
"namepart": {
"analyzer": "standard_analyzer",
"type": "string"
}
}
}
}
}
},
"aliases": []
}
}
},

We use routing in our implementation, and all the data being searched are
segmented by userid.

Our current searches are similar to the following:

curl -XPOST '
http://localhost:9200/people_v1/person/_search?pretty=true&routing=9b3ac7e80c7968b8ca1d6e56f96a86c6'
-d '
{
size: 9,
sort:
{
score:
{
order: "desc"
}
},
query:
{
filtered:
{
query:
{
query_string: { query:
"userid:9b3ac7e80c7968b8ca1d6e56f96a86c6" }
},
filter:
{
and:[
{
or:[
{
term:
{
ngram: "as"
}
},
{
term:
{
company_ngram: "as"
}
},
{
query:
{
match:
{
nickname: "m"
}
}
}]
},
{
not:
{
term:
{

key:"9b3ac7e80c7968b8ca1d6e56f96a86c605d73158bcf9968d0d6bcd99db050f4b"
}
}
}]
}
}
}
}
}'

I've tried different mechanisms of searching, but still experience the
random slowness. For instance, this query exhibits the same behavior:

curl -XPOST '
http://localhost:9200/people_v1/person/_search?pretty=true&routing=9b3ac7e80c7968b8ca1d6e56f96a86c6'
-d '
{
size: 9,
query:
{
filtered:
{
query:
{
custom_score:
{
query:
{
term: { userid: "9b3ac7e80c7968b8ca1d6e56f96a86c6"
}
},
script: "_source.score"
}
},
filter:
{
and:[
{
or:[
{
prefix:
{
namepart: "chuck"
}
},
{
prefix:
{
company: "chuck"
}
},
{
query:
{
match:
{
nickname: "chuck"
}
}
}]
},
{
not:
{
term:
{

key:"9b3ac7e80c7968b8ca1d6e56f96a86c605d73158bcf9968d0d6bcd99db050f4b"
}
}
}]
}
}
}
}
}'

Our results need to be sorted by a "score" that we pre-compute (the query
above attempts to use that score). Our searches use an edge ngram to
retrieve results for terms of up to 5 characters, and then we switch to
prefix queries. Essentially, searches retrieve results based on name,
nickname, and company name, sorted by score.
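The cutover between the ngram and prefix filters can be sketched as a small
client-side helper. This is purely illustrative: the function name is
hypothetical, and the exact 5-character threshold is an assumption based on
the description above and the index's edge ngram max_gram of 5.

```shell
# Hypothetical helper: pick the filter clause to send to Elasticsearch
# based on the length of the search term. Terms shorter than the edge
# ngram max_gram (5) are fully covered by the indexed "ngram" field;
# longer terms fall back to a prefix query on the raw field.
choose_filter() {
  term="$1"
  if [ "${#term}" -lt 5 ]; then
    printf '{"term": {"ngram": "%s"}}' "$term"
  else
    printf '{"prefix": {"namepart": "%s"}}' "$term"
  fi
}

echo "$(choose_filter "as")"     # {"term": {"ngram": "as"}}
echo "$(choose_filter "chuck")"  # {"prefix": {"namepart": "chuck"}}
```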

Our results from "curl -XGET 'http://localhost:9200/_segments?pretty=true'"
show 17-30 search segments per shard. I've read some advice online that
running optimize may help, although other sources say not to run optimize
because it can get you into trouble, and to let Lucene handle merging.
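For reference, if we were to experiment with optimize, bounding the segment
count (rather than forcing a single segment) looks like the gentler option.
A sketch against this index using the 0.90-era API, with an arbitrary
segment target chosen for illustration:

```
curl -XPOST 'http://localhost:9200/people_v1/_optimize?max_num_segments=5'
```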

Any advice on how we can debug this issue/improve performance?

Thanks,
-Mike

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Hello Mike,

The point of a warmer is to run queries for you, so that when the caches get
cold, the warmer takes the performance hit instead of your customers'
queries. So I'd start with a typical query, including sorting, because that
will warm up your field caches, and see how it goes from there.
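A warmer is registered with a full search body, so it can include the sort.
Here is a sketch against the index above (the warmer name is hypothetical,
and the filter value is just a representative short term; API as of 0.90):

```
curl -XPUT 'http://localhost:9200/people_v1/_warmer/sorted_people' -d '
{
  "query": {
    "filtered": {
      "query": { "query_string": { "query": "userid:9b3ac7e80c7968b8ca1d6e56f96a86c6" } },
      "filter": { "term": { "ngram": "as" } }
    }
  },
  "sort": [ { "score": { "order": "desc" } } ]
}'
```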

Best regards,
Radu

On Tue, Jul 30, 2013 at 8:29 PM, Mike mtheroux2@gmail.com wrote:

Thank you for the response. I did notice the warmers API, but what is not
entirely clear is what makes a good warmer. What types of queries should be
executed so that the cache is sufficiently warmed? For instance, we do
sorting in our queries. Should the warmer sort? If so, it wasn't clear to
me from the API how this can be specified (there is a "query" field but no
"sort" field).

Thanks again,
-Mike

On Tuesday, July 30, 2013 12:35:56 PM UTC-4, Radu Gheorghe wrote:

Hello Mike,

Sounds like warmers (http://www.elasticsearch.org/guide/reference/api/admin-indices-warmers/)
will help with caching, if your load is generally low and you get slow
queries. This will raise the load when indexing, though.

Regarding optimizing, it's a bad idea if your indices are changing, because
a merge (which will happen with subsequent indexing) will invalidate caches,
and if you have only one (or just a few) segments, a merge will invalidate a
lot of cached data. That's not the case if you have, for example, time-based
indices and you know that yesterday's index will never change. Then it
should help to optimize, if you have I/O and CPU to spare. And it seems you
do :)

You can try changing the merge policy
(http://www.elasticsearch.org/guide/reference/index-modules/merge/) if your
index is changing. You can tune it for fewer segments, although if you have
very few segments you'll run into the same issue as with optimizing.
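For instance, assuming the default tiered merge policy, lowering the
segments-per-tier target pushes Lucene toward fewer segments. A sketch
(setting names as of 0.90; the values here are illustrative, and
segments_per_tier must be at least max_merge_at_once):

```
curl -XPUT 'http://localhost:9200/people_v1/_settings' -d '
{
  "index.merge.policy.segments_per_tier": 5,
  "index.merge.policy.max_merge_at_once": 5
}'
```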

Another thing you can do is to look at what happens with your ES cluster
when such a query runs slowly. What are the caches doing? What's the GC
doing? What are CPU and I/O doing at the time? If you don't have a tool that
shows you this, you can see it with our SPM
(http://sematext.com/spm/elasticsearch-performance-monitoring/).

If a GC spike is the problem, you can try changing to the G1 GC. This
helped us at one point
(http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/).
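Switching collectors is a JVM flag change. A sketch of one way to pass it;
where the flag actually lands depends on your startup script, and note that
0.90's elasticsearch.in.sh sets CMS flags you would need to remove so they
don't conflict:

```
# Hypothetical: run the node with G1 instead of the default CMS collector.
export ES_JAVA_OPTS="-XX:+UseG1GC"
bin/elasticsearch
```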

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Tue, Jul 30, 2013 at 7:07 PM, Mike mthe...@gmail.com wrote:



--
http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
