We tried using soft caches, but it didn't matter, as they were not actually
being invalidated. It was just for testing purposes, and we are not using
them anymore. We also don't have many memory problems. We are running
the JVM with a 30 GB heap, so that's plenty for our needs at the moment.
Leonardo Menezes
(+34) 688907766
http://lmenezes.com
http://twitter.com/leonardomenezes
On Fri, Jan 25, 2013 at 11:06 AM, Jörg Prante joergprante@gmail.com wrote:
Hi,
interesting... it looks like your system can fit the 5000k documents into
the cache with "execution_hint: map" without being hit seriously by GC.
Without execution_hint:map, do you use soft refs by any chance? That would
explain the 600ms; it could be extra time spent because your cache elements are
being invalidated.

Jörg
On 25.01.13 10:16, Leonardo Menezes wrote:
So... just to give an update on this. Reading the source code last night,
we found a parameter that doesn't seem to be documented anywhere and that
controls which faceting method is used for a given
field. The parameter is called execution_hint and should be used like this:

"facets" : {
  "company" : {
    "terms" : {
      "field" : "current_company",
      "size" : 15,
      "execution_hint" : "map"
    }
  }
}

The process of choosing the faceting method occurs in TermsFacetProcessor
and is a bit different for strings than it is for other types. Anyway,
after running some tests with this setting, our response time improved a
LOT. So, some numbers:

Index: 12MM documents
Field: string, multi-valued, with about 400k unique values
Document: has between 1 and 10 values for this field

Query #1 (matches 5000k documents)
- using "execution_hint":"map" - roughly 50ms avg.
- not using it - roughly 600ms avg.

Query #2 (match all, so 12MM documents)
- using "execution_hint":"map" - roughly 1.9s avg.
- not using it - roughly 800ms avg.
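
To make the trade-off in these numbers explicit, here is a minimal Python sketch that turns the reported averages into ratios. The timings are copied from the list above; the dictionary keys and the `speedup` helper are illustrative names, not anything from Elasticsearch.

```python
# Average response times reported in the message above, in milliseconds.
# Key names are made up for this illustration.
reported_ms = {
    "query_1_subset": {"map": 50, "default": 600},
    "query_2_match_all": {"map": 1900, "default": 800},
}


def speedup(default_ms, map_ms):
    """Ratio of default time to "map" time; values below 1 mean "map" is slower."""
    return default_ms / map_ms


for name, timings in reported_ms.items():
    ratio = speedup(timings["default"], timings["map"])
    print(f"{name}: map is {ratio:.2f}x the speed of the default")
```

On these figures the hint is about 12 times faster for the narrower query and a bit more than twice as slow for match-all, so whether it helps seems to depend on how many documents the query matches.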
So, since our query pattern is really close to query #1, that really made
a big difference in our results. Hope that might be of some help to
someone else.

Leonardo Menezes
(+34) 688907766
http://lmenezes.com
http://twitter.com/leonardomenezes
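
For anyone who wants to compare both variants, the facet fragment quoted in the message above could be built programmatically, roughly as in the Python sketch below. The helper name `terms_facet_request` is hypothetical, and the sketch only produces the "facets" part of a request body; a real search request would normally also carry a query, which the thread does not show.

```python
import json


def terms_facet_request(name, field, size, execution_hint=None):
    """Build the "facets" fragment of a search body (hypothetical helper).

    execution_hint is only included when given, so the same function can
    produce both variants that were timed in the thread.
    """
    terms = {"field": field, "size": size}
    if execution_hint is not None:
        terms["execution_hint"] = execution_hint
    return {"facets": {name: {"terms": terms}}}


# Same field, size, and hint as in the quoted message.
body = terms_facet_request("company", "current_company", 15, execution_hint="map")
print(json.dumps(body, indent=2))
```

Calling the helper without `execution_hint` leaves the choice of faceting method to the server, which is the "not using it" case in the timings.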
On Thu, Jan 24, 2013 at 4:39 PM, Drew Raines <aaraines@gmail.com> wrote:
Ivan Brusic wrote:
> Have you seen some of the latest commits?
>
> https://github.com/elasticsearch/elasticsearch/commit/346422b74751f498f037daff34ea136a131fca89
>
> There are no issues attached to these commits, so there is no
> telling what version they belong to.

The goal is for the fielddata refactoring and Lucene 4.1 integration
to appear in 0.21.0. Much of the work is already in master.

-Drew