Hi everyone,
I'm new to Elasticsearch, so please be forgiving if I'm missing something
obvious. I'm comparing its performance with Solr's for a job that will
require large Boolean filters. However, my initial tests with trivial
filters have performed very poorly. The index holds 25M records
(65GB, un-optimised). For comparison, a similar Solr index is 102GB.
The following request takes about 300ms uncached, which reduces to about
200ms after repeated requests:
$ curl localhost:9200/core/_search -d '{
  "filter": { "term": { "publication_acronym": "EEN" } },
  "size": 0
}'
{"took":182,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":246929,"max_score":1.0,"hits":[]}}
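In case anyone wants to reproduce the timings, here is a minimal sketch of how the first and repeated requests could be compared. It assumes the same URL and request body as the curl call above; the helper names `run_once` and `summarise` are made up for illustration, and it reads the server-reported "took" value rather than wall-clock time.

```python
import json
import statistics
import urllib.request

# Assumed values: these mirror the curl request above.
URL = "http://localhost:9200/core/_search"
BODY = {"filter": {"term": {"publication_acronym": "EEN"}}, "size": 0}


def run_once(url=URL, body=BODY):
    """POST the search body (as curl -d does) and return the decoded JSON."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarise(responses):
    """Split the 'took' times (ms) into the first run and the later runs."""
    took = [r["took"] for r in responses]
    warm = took[1:]
    return {
        "first_ms": took[0],
        "warm_median_ms": statistics.median(warm) if warm else None,
        "hits": responses[0]["hits"]["total"],
    }


if __name__ == "__main__":
    # Ten back-to-back requests; only the first should be uncached.
    print(summarise([run_once() for _ in range(10)]))
```

Reporting the median of the later runs keeps a single slow request from skewing the "warm" figure.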
A similar Solr search takes about 900ms at first, then drops to 2-3ms. So
I'm guessing that ES is not caching the filter effectively. I'm giving
both ES and Solr 16GB heap (out of 24GB total).
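To test that guess, one variant I could time against the original is the same term filter wrapped in a `constant_score` query instead of a top-level "filter". The sketch below only builds the two request bodies so they can be sent the same way. I have not confirmed that the second form changes how ES caches or applies the filter; that is an assumption to check, and the function names are hypothetical.

```python
import copy
import json

# The term filter from my request, unchanged.
TERM_FILTER = {"term": {"publication_acronym": "EEN"}}


def top_level_filter_body(term_filter):
    """The request shape used in my test: a top-level "filter"."""
    return {"filter": copy.deepcopy(term_filter), "size": 0}


def constant_score_body(term_filter):
    """Variant to compare: the same filter inside a constant_score query."""
    return {
        "query": {"constant_score": {"filter": copy.deepcopy(term_filter)}},
        "size": 0,
    }


if __name__ == "__main__":
    # Print both bodies so they can be pasted into curl -d '...'.
    print(json.dumps(top_level_filter_body(TERM_FILTER)))
    print(json.dumps(constant_score_body(TERM_FILTER)))
```

If the two forms show the same timings after warm-up, the request shape is probably not the cause.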
I'll paste my "core" setup at the end of this message. Does anyone have any
idea of what might be wrong, and how I can fix it?
Many thanks,
Tom
{
  "settings" : {
    "number_of_shards" : 1
  },
  "analysis" : {
    "analyzer" : {
      "standard" : { "type" : "standard" },
      "english" : { "type" : "english" }
    }
  },
  "mappings" : {
    "article" : {
      "properties" : {
        "id" : { "type" : "string", "index" : "not_analyzed" },
        "publication_name" : { "type" : "string", "index" : "not_analyzed" },
        "publication_acronym" : { "type" : "string", "index" : "not_analyzed" },
        "publication_subsource" : { "type" : "string", "index" : "not_analyzed" },
        "edition" : { "type" : "string", "index" : "not_analyzed" },
        "region" : { "type" : "string", "index" : "not_analyzed" },
        "day" : { "type" : "string", "index" : "not_analyzed" },
        "page_section" : { "type" : "string", "index" : "not_analyzed" },
        "author" : { "type" : "string", "index" : "not_analyzed" },
        "author_t" : { "type" : "string", "analyzer" : "standard" },
        "headline" : { "type" : "string", "analyzer" : "standard" },
        "subheadline" : { "type" : "string", "analyzer" : "standard" },
        "byline" : { "type" : "string", "analyzer" : "standard" },
        "caption" : { "type" : "string", "analyzer" : "standard" },
        "spellcheck" : { "type" : "string", "analyzer" : "standard" },
        "body" : { "type" : "string", "analyzer" : "english" },
        "para1" : { "type" : "string", "analyzer" : "english" },
        "wordcount" : { "type" : "integer" },
        "restriction" : { "type" : "integer" },
        "page_numbers" : { "type" : "integer" },
        "publication_date" : { "type" : "date" },
        "in_last_edition" : { "type" : "boolean" }
      }
    }
  }
}