Distributed Frequency Search

According to
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/relevance-is-broken.html,
relevance is broken until there is enough data distributed uniformly
across shards.

My question is: if I initially use the ?search_type=dfs_query_then_fetch
parameter because I have little data, will it hurt performance once the
production environment has enough data distributed uniformly across shards?

Thanks.
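
For anyone unsure where the parameter goes: it is passed as a query-string
argument on the _search endpoint. Below is a minimal Python sketch that only
builds the request URL and sends nothing. The host, the index name
(my_index) and the helper build_search_url are made up for illustration and
are not part of any Elasticsearch client.

```python
from urllib.parse import urlencode


def build_search_url(base_url, index, search_type=None):
    """Return the _search URL for `index`, optionally with a search_type parameter."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    if search_type is not None:
        # search_type is sent as an ordinary query-string parameter.
        url += "?" + urlencode({"search_type": search_type})
    return url


# Hypothetical host and index name, for illustration only.
default_url = build_search_url("http://localhost:9200", "my_index")
dfs_url = build_search_url("http://localhost:9200", "my_index", "dfs_query_then_fetch")

print(default_url)
print(dfs_url)
```

Leaving search_type off falls back to the default search type, so switching
DFS on or off later should only require changing this one argument.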

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/50856f05-4329-4001-a833-6c568fc5730f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Answering myself:

According to the ES blog post
http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/
there is a performance hit. It would be nice to have a feature that
automatically triggers DFS based on some kind of threshold...
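
Until something like that exists, the threshold idea could be approximated
on the client side. This is only a rough sketch of the wish above, not an
Elasticsearch feature: the function name choose_search_type and the default
threshold of 100,000 documents are assumptions picked for illustration, and
the document count would have to come from somewhere else (for example a
count request against the index).

```python
def choose_search_type(total_docs, dfs_threshold=100_000):
    """Pick a search_type based on how many documents the index holds.

    Small indices get dfs_query_then_fetch, so term statistics are collected
    across all shards before scoring. Larger indices use the cheaper
    query_then_fetch. The threshold value is an arbitrary assumption.
    """
    if total_docs < dfs_threshold:
        return "dfs_query_then_fetch"
    return "query_then_fetch"


# Example: a nearly empty development index versus a large production index.
print(choose_search_type(500))
print(choose_search_type(5_000_000))
```

The right threshold depends on how evenly terms are spread across shards, so
it would need to be tuned per index rather than taken from this example.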

(In reply to the original question from Sofiane Cherchalli, Wednesday, November 5, 2014 2:44:14 PM UTC+1.)

We did some performance testing and found that the performance hit from
using DFS was minor.

--
Ivan

(In reply to Sofiane Cherchalli's message of Wed, Nov 5, 2014 at 8:55 AM.)
