When I used Solr in the past, I could look at the time taken by each component to
understand where most of the time was spent for a particular query.
Similarly, I am trying to understand the breakdown of time spent on one
particular query. Can anyone tell me how I can investigate the performance of
specific queries in Elasticsearch?
That is not easy, and the reason is that Elasticsearch and Solr work in
quite different ways, e.g. when it comes to computing facets/aggregations:
Solr first computes top hits, and if facets are required, it loads the
doc IDs of the matching documents into a bit set that is used in a
subsequent step to compute facets. Elasticsearch, on the other hand,
computes both top hits and facets/aggregations at the same
time (in the same "Collector", if you are familiar with Lucene terminology),
which makes timings harder to track.
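To make the difference concrete, here is a toy sketch in Python. It is not Solr or Elasticsearch code; the corpus, the function names and the use of a plain set in place of a bit set are all assumptions made for illustration. It only shows why a two-phase flow yields a per-step timing while a single-pass flow yields one combined number.

```python
import time
from collections import Counter

# Toy corpus of (doc_id, score, category). All values are made up.
DOCS = [(0, 0.9, "a"), (1, 0.4, "b"), (2, 0.7, "a"), (3, 0.1, "c")]


def two_phase(docs, k):
    """Solr-like flow as described above: top hits first, facets second."""
    timings = {}

    start = time.perf_counter()
    top = sorted(docs, key=lambda d: d[1], reverse=True)[:k]
    matched_ids = {d[0] for d in docs}  # stands in for the bit set of doc IDs
    timings["top_hits"] = time.perf_counter() - start

    start = time.perf_counter()
    facets = Counter(d[2] for d in docs if d[0] in matched_ids)
    timings["facets"] = time.perf_counter() - start

    return top, facets, timings


def single_pass(docs, k):
    """Elasticsearch-like flow: one loop feeds both top hits and facet counts."""
    start = time.perf_counter()
    top, facets = [], Counter()
    for doc in docs:
        # Keep the k best documents seen so far.
        top.append(doc)
        top.sort(key=lambda d: d[1], reverse=True)
        del top[k:]
        # Count the facet value in the same iteration.
        facets[doc[2]] += 1
    # Only one timing is available: the two kinds of work are interleaved.
    return top, facets, {"total": time.perf_counter() - start}
```

Both functions return the same hits and counts; the only difference is that the first can report `top_hits` and `facets` separately, while the second can only report a single `total`.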
On Thu, May 22, 2014 at 6:25 AM, Srinivasan Ramaswamy ursvasan@gmail.com wrote:
When I used Solr in the past, I could look at the time taken by each component
to understand where most of the time was spent for a particular query.
Similarly, I am trying to understand the breakdown of time spent on one
particular query. Can anyone tell me how I can investigate the performance of
specific queries in Elasticsearch?
Thanks for the clarification. I suspected (as mentioned on the ES site)
that the significant terms feature was the bottleneck, and I verified this by
timing the query with the feature turned on and off. With the
feature on, the query takes 5 s (every time); with it off, the query
takes only 160 ms.
Does anyone have any tips on how to make the significant terms feature much
faster? Does it require a large JVM heap? Currently I am running on
a virtual machine with 16 GB, of which I allocated 8 GB to the JVM. I have 6 nodes with
a total of 24 shards and 1 replica (default). Each shard is 2.5 GB. How
can I optimize the significant terms feature?
Thanks
Srini
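For anyone who wants to repeat the on/off comparison described above, here is a minimal sketch in Python. It assumes the query is sent to the `_search` endpoint and that the `took` field of the response (reported in milliseconds) is a good enough measure. The aggregation name `sig`, the helper names, and any URL, index or field you pass in are placeholders, not details from this thread.

```python
import json
import urllib.request


def build_bodies(query, field):
    """Return (body without aggregation, body with significant_terms)."""
    base = {"query": query}
    with_agg = dict(base, aggs={"sig": {"significant_terms": {"field": field}}})
    return base, with_agg


def http_search(url, body):
    """POST a search body to a _search URL and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def compare_took(search, query, field, runs=3):
    """Run both variants `runs` times and return the median `took` of each.

    `search` is any callable that takes a request body and returns the
    decoded response, so the HTTP layer can be swapped out.
    """
    base, with_agg = build_bodies(query, field)

    def median_took(body):
        tooks = sorted(search(body)["took"] for _ in range(runs))
        return tooks[len(tooks) // 2]

    return {"without": median_took(base), "with": median_took(with_agg)}
```

A call might look like `compare_took(lambda b: http_search("http://localhost:9200/myindex/_search", b), {"match_all": {}}, "body")`, where the host, index name and field are hypothetical. Taking the median over a few runs helps separate a consistently slow aggregation from a one-off cold-cache effect.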