Hi,
As far as search type goes, the example I showed was just using curl,
so it's whatever Elasticsearch defaults to. We haven't explicitly defined a
default search type, so I don't know which one it's using.
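For reference, a minimal sketch of pinning the search type explicitly rather than relying on the default. It assumes the `search_type` URL parameter on the `_search` endpoint and reuses the host and index name from the query quoted further down; `query_then_fetch` is, to my knowledge, what Elasticsearch uses when nothing is specified, but treat that as an assumption to verify against your version.

```shell
# Host and index name are the placeholders used in the original query.
ES_HOST="es.dev.example.com:9200"
INDEX="indexname"

# Assumed default; set explicitly so the test does not depend on it.
SEARCH_TYPE="query_then_fetch"

# Build the search URL with the search type spelled out.
URL="http://${ES_HOST}/${INDEX}/_search?search_type=${SEARCH_TYPE}&pretty=true"
echo "$URL"

# Uncomment to run against a live cluster:
# time curl -XGET "$URL" -d '{"query":{"match":{"term":{"query":"Stonehenge"}}}}'
```

Running the same query with a different `SEARCH_TYPE` value and comparing the reported `took` would show whether the search type matters here at all.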
I should have included an example of the data; it's pretty small. Here's
an example:
---snip---
{
  "_index" : "ideas",
  "_type" : "idea",
  "_id" : "v6nZyBjnTOmoY8ocFgYUcw",
  "_score" : 12.482641,
  "_source" : {"networks":38777,"creator_last_name":"Tan","definition":"Example of an archeological site (Neolithic origination, finished in the Bronze Age); located in England","time_created":"2012-05-22T00:25:05.090Z","network_name":"Blah blah Blahl","original_card_id":81897377,"frequency":1,"families":293466,"creator_id":1112643,"creator_first_name":"Howard","authors":1112643,"idea_signature":"-23795688954730064,-3778677691914660","num_views":0,"time_updated":"2012-05-22T00:25:05.107Z","network_id":38777,"term":"Stonehenge","original_document_id":3018120,"term_signature":"-23795688954730064","media":"http://example.com/images/stonehenge1334597688387.jpg","image_fill":"1"}
}
---snip---
No highlighting.
-Sean
On Thursday, October 18, 2012 12:53:31 AM UTC-7, simonw wrote:
Hey Sean,
On similar indices this query returns in 10ms or less. Can you give
me more information about how much data you are returning (which fields),
and whether you are doing highlighting, etc.? I also wonder how big your
documents are and what search type you are running, e.g. QueryThenFetch?
simon
On Thursday, October 18, 2012 2:06:53 AM UTC+2, VegHead wrote:
We're seeing query performance that is surprisingly slow. We're running 10
nodes with 0.19.10 on AWS EC2. Each instance is an m1.xlarge with 15GB of
RAM and 8x 60GB EBS volumes in RAID-0 for data storage.
ES_MIN_MEM=8192M
ES_MAX_MEM=14000M
We have 8 indexes with a grand total of roughly 110GB of data. Each index
has 20 shards and 3 replicas. The largest index is roughly 55GB and has 35
million records. Queries against this index typically take over 1.5 seconds:
---snip---
time curl -XGET 'http://es.dev.example.com:9200/indexname/_search?pretty=true' -d '{
"query" : {
"match" : {
"term" : {
"query" : "Stonehenge"
}
}
}
}'
{
"took" : 1385,
"timed_out" : false,
"_shards" : {
"total" : 20,
"successful" : 20,
"failed" : 0
},
"hits" : {
"total" : 545,
...
}
real 0m1.676s
user 0m0.011s
sys 0m0.006s
---snip---
In this case, "term" is a stored string value. System load is low:
very little CPU utilization and light disk I/O. I was really hoping
simple queries would take less than 250ms, so I'm pretty shocked by the
performance.
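One number worth separating out from the output above: `took` is the search time Elasticsearch reports in milliseconds, while `real` is the wall-clock time of the whole curl invocation. A minimal sketch of the difference, using only the two figures from the example (the variable names are just for illustration):

```shell
# Figures copied from the example output above.
took_ms=1385      # "took" reported by Elasticsearch, in milliseconds
real_s="1.676"    # "real" reported by time, in seconds

# Convert wall-clock seconds to whole milliseconds.
real_ms=$(awk -v s="$real_s" 'BEGIN { printf "%d", s * 1000 + 0.5 }')

# Time not accounted for by "took" (connection setup, transfer,
# response rendering, curl startup and so on).
overhead_ms=$(( real_ms - took_ms ))

echo "server-side search: ${took_ms} ms"
echo "outside of took:    ${overhead_ms} ms"
```

So most of the 1.6 seconds is spent inside the search itself, which suggests looking at the cluster rather than at the client or the network first.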
What should we look at to determine why the queries are slow? Are there
any strategies to improve performance?
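As one starting point, the shard layout described above can be turned into a rough per-node count. This is only a sketch: it assumes "3 replicas" means three replica copies in addition to each primary, that all 8 indexes share the same settings, and that shards are spread evenly across the 10 nodes.

```shell
# Figures from the description above.
indexes=8
primaries_per_index=20
replicas=3    # assumed: 3 replica copies per primary shard
nodes=10

# Each primary plus its replicas.
copies_per_shard=$(( 1 + replicas ))

# Total shard copies in the cluster and the average per node.
total_shard_copies=$(( indexes * primaries_per_index * copies_per_shard ))
per_node=$(( total_shard_copies / nodes ))

echo "total shard copies: ${total_shard_copies}"
echo "shard copies per node (even spread assumed): ${per_node}"
```

This does not explain the latency by itself, but it gives a concrete figure to weigh against the heap and disk available on each node.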
Any advice would be greatly appreciated.
-Sean