Hi Otis,
thanks for the hint.
Do you know an API which returns info about the Field Cache status?
I tried things like:
#curl -XGET 'http://localhost:9200/_status' > status.log
#grep field status.log
#curl localhost:9200/_stats > stats.log
#grep field stats.log
Or do you have other strategies to improve the performance of a query
with 50 term facets, using the API or SPM at Sematext?
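A plain grep over the saved response only shows matching lines without their position in the document. As a minimal sketch, assuming the saved response is JSON, the helper below walks the parsed structure and reports the full path of every key whose name contains a given word. The function name `find_keys` is made up for illustration, and whether any field cache entry appears at all depends on the Elasticsearch version.

```python
def find_keys(node, needle, path=""):
    """Yield (path, value) for every dict key whose name contains `needle`.

    `node` is any parsed JSON value (dict, list, or scalar).
    Matching is case-insensitive.
    """
    needle = needle.lower()
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if needle in key.lower():
                yield child, value
            # Keep descending so nested matches are reported too.
            yield from find_keys(value, needle, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_keys(item, needle, f"{path}[{index}]")
```

With stats.log from the commands above, this could be used as `for p, v in find_keys(json.load(open("stats.log")), "field"): print(p, v)` after `import json`.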
On May 3, 8:00 am, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:
Hi Ridvan,
I don't think "field cache loading" (time?) is captured anywhere. If it
is, I'd love to know.
Like Shay said, FC is loaded with field values when you first facet or sort
on a given field. If you facet on field X, when you do it for the first
time, all values from X will be loaded into FC. If you then later facet or
sort on field Y, all values from Y will be loaded into FC when you do that.
Otis
Performance Monitoring for Solr / Elasticsearch / HBase - Sematext Monitoring | Infrastructure Monitoring Service
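Since the field cache is filled the first time a field is faceted or sorted on, the first runs of a query include loading time. A minimal sketch of measuring only warmed-up runs follows. `run_query` is a hypothetical callable that executes the search once (for example an HTTP POST of the request body); the warm-up and sample counts are arbitrary defaults, not recommended values.

```python
import time


def timed_after_warmup(run_query, warmups=3, samples=5):
    """Call `run_query` `warmups` times untimed, then time `samples` further calls.

    Returns the list of wall-clock durations in seconds for the timed calls.
    """
    for _ in range(warmups):
        run_query()  # untimed: lets caches fill before measuring

    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings
```

Taking the median of the returned list rather than a single run makes the number less sensitive to one slow request.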
On Wednesday, May 2, 2012 2:27:30 PM UTC-4, Ridvan Gyundogan wrote:
Which API should I use for the measurements of the "field cache
loading"?
On May 2, 7:42 pm, Shay Banon kim...@gmail.com wrote:
On Wed, May 2, 2012 at 1:13 PM, Ridvan Gyundogan ridva...@gmail.com wrote:
Hi Shay,
Do the times you mentioned happen when data is not being indexed, and if
not, can you check (I want to get the field cache loading out of the
picture for a sec)?
I don't know how to check those. Can I check with the SPM monitor of
Sematext, or is there another way to do this?
Just execute the query several times, then the facet data will be in
memory. Only once you do that, take measurements.
Thanks for the help and Kind Regards,
Ridvan
On Apr 29, 8:07 pm, Shay Banon kim...@gmail.com wrote:
Those are quite a lot of facets. Do the times you mentioned happen when
data is not being indexed, and if not, can you check (I want to get the
field cache loading out of the picture for a sec)?
Also, one thing that I would do, if you have a dashboard-like system, is
to simply use AJAX and multiple search requests, one for each facet, and
display the results for each specific search/facet as they come. Let's see
how this helps things. You can test the same search request with just one
facet and see how long it takes.
Also, I find it strange that you got such a search perf improvement when
compressing the transport, given that you have a 1gb link. Are the facets
big? (You do get 50 from each one, still strange though...)
On Fri, Apr 27, 2012 at 11:59 PM, Ridvan Gyundogan ridva...@gmail.com wrote:
Hi Shay, here is the full query, some attribute names changed to f1...fn.
Query on the whole index · GitHub
In the meantime we noticed that if we remove the terms_stats facet on
the userId field we get some 30% improvement. The userId has 1 mln
different test values, but only one per document.
Kind Regards
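Shay's suggestion to test the same search request with just one facet, and to send one request per facet, can be sketched as below. This is only an illustration: the function name `one_request_per_facet` and the example body are hypothetical, and the body assumes the facet layout of the search DSL of that time (a top-level "facets" object keyed by facet name). The real query from the gist is not reproduced here.

```python
import copy


def one_request_per_facet(body):
    """Split a search body into one body per facet.

    Returns {facet_name: request_body}; each body is a deep copy of the
    original with every facet removed except the named one. All other
    parts of the request (query, size, ...) are left unchanged.
    """
    requests = {}
    for name in body.get("facets", {}):
        single = copy.deepcopy(body)
        single["facets"] = {name: single["facets"][name]}
        requests[name] = single
    return requests


# Hypothetical example body with two facets (field names are placeholders).
example = {
    "query": {"match_all": {}},
    "facets": {
        "f1": {"terms": {"field": "f1", "size": 50}},
        "userId": {"terms_stats": {"key_field": "userId", "value_field": "f2"}},
    },
}
per_facet = one_request_per_facet(example)
```

Timing each resulting body separately shows which facets dominate the total, as happened with the userId facet above, and the same bodies can be sent as independent requests from a dashboard.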
On Apr 27, 1:18 pm, Shay Banon kim...@gmail.com wrote:
Can you share the full search request you execute?
On Thu, Apr 26, 2012 at 11:52 AM, Ridvan Gyundogan ridva...@gmail.com wrote:
Hi Group,
I've read all the info on the net about performance tuning of
elasticsearch, but I am still not satisfied with the query execution time
of our index.
We have the following:
Hardware:
- 2 bare metal AMD machines, each 6 core 3Ghz, one with 16GB, the other
with 32GB RAM
- 1GB network hardware, at least 100MB is supported.
Elasticsearch:
- version 19.1
- 8GB RAM dedicated to elasticsearch. ulimit -n 100000, ulimit -l
unlimited, ES_HEAP_SIZE=8g, bootstrap.mlockall: true. The memory is
locked by elasticsearch. I check this by #cat /proc//status | grep VmLck
- result is: "VmLck: 8864712 kB"
- 2 shards - 1 shard on each server
- no replicas
Documents:
- 10mln documents - average size 2 kb
- each document has 30 string, not_analyzed, not stored fields.
- average field size - 30 chars
- 5 fields are String arrays - size average 10
- _all field is disabled
- _source.compress : true
Query:
- 30 Facets, no facet filters, start:0 size:10
Query Execution time: 5 - 9 sec after the first query.
The only query execution time improvement we achieved was from the
transport.tcp.compress: true option which gave us some 1.5 sec, it was
6.5 - 10 sec before that.
These times are still ok, it is for management reporting, but I really
hope to be able to improve them.
Anybody got better query time performance, and how?
Kind Regards,
Ridvan