How does dfs_query_then_fetch work?

ankur_singla · January 11, 2018, 3:31am

Hi guys,

As per my understanding, dfs_query_then_fetch is used to get accurate results across multiple shards.

The terms aggregation returns an approximate count per term (I am still confused about how it calculates the document count error).

If I use a dfs_query_then_fetch search with a terms aggregation, will it return accurate results? I tried, but it does not: I am still getting a doc_count_error_upper_bound value greater than zero.

QUERY:

GET demo/_search?pretty=true&search_type=dfs_query_then_fetch
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "demo1.keyword",
                "size" : 5
            }
        }
    }
}

Result:

"aggregations": {
    "products": {
      "doc_count_error_upper_bound": 10959,
      "sum_other_doc_count": 10454016,
      "buckets": [
        {
          "key": "SK148",
          "doc_count": 4442
        },
        {
          "key": "SK67",
          "doc_count": 4432
        },
        {
          "key": "SK489",
          "doc_count": 4420
        },
        {
          "key": "SK592",
          "doc_count": 2245
        },
        {
          "key": "SK88",
          "doc_count": 2245
        }
      ]
    }
  }

I think I have a misconception here.

Please guide me on this. It would be a great help.

Mikhail_Khludnev · January 11, 2018, 4:13am

I believe dfs_query_then_fetch calculates terms' IDFs globally without
per-shard skew.
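
To make the "per-shard skew" concrete, here is a minimal Python sketch, not Elasticsearch code. It assumes the BM25-style IDF formula used by Lucene's BM25Similarity and two hypothetical shards with made-up document counts. It only illustrates why a score computed from shard-local statistics can differ from one computed from statistics collected across all shards, which is what the DFS phase provides.

```python
import math


def idf(doc_freq, doc_count):
    """BM25-style inverse document frequency (assumed formula)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical shards: (documents on the shard, documents containing the term)
shards = [(1000, 10), (1000, 200)]

# query_then_fetch: each shard scores with its own local statistics
per_shard_idf = [idf(df, n) for n, df in shards]

# dfs_query_then_fetch: statistics are summed across shards before scoring
total_docs = sum(n for n, _ in shards)
total_df = sum(df for _, df in shards)
global_idf = idf(total_df, total_docs)

print("per-shard IDF:", per_shard_idf)
print("global IDF:   ", global_idf)
```

The same term gets a high weight on the shard where it is rare and a low weight on the shard where it is common; the global value sits in between and is the same for every shard.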

ankur_singla · January 11, 2018, 8:16am

Yeah. I am assuming this means it will calculate counts over the whole data set rather than at the shard level, so I should get accurate counts with doc_count_error_upper_bound either absent or zero.

Am I right?

Mikhail_Khludnev · January 11, 2018, 1:37pm

No. I believe dfs_query_then_fetch impacts only scoring (search), but not
aggregations.
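
As a rough illustration of where doc_count_error_upper_bound comes from, here is a small Python simulation, not the actual Elasticsearch implementation. It assumes the behaviour described in the terms aggregation documentation: each shard returns only its top shard_size terms, the coordinating node sums what it receives, and the error bound is the sum of the doc count of the last term returned by each shard that had to cut terms off. The shard contents and the terms_agg helper are hypothetical. Note that the search type does not appear anywhere in this calculation.

```python
from collections import Counter

# Hypothetical per-shard term counts
shards = [
    {"SK148": 50, "SK67": 40, "SK489": 5},
    {"SK489": 45, "SK67": 30, "SK148": 25},
]


def terms_agg(shards, size, shard_size):
    """Merge per-shard top terms the way a coordinating node would (sketch)."""
    merged = Counter()
    error_upper_bound = 0
    for shard in shards:
        # Each shard only reports its top shard_size terms
        top = Counter(shard).most_common(shard_size)
        merged.update(dict(top))
        if len(shard) > shard_size:
            # A term cut off on this shard has at most the count
            # of the last term the shard did return
            error_upper_bound += top[-1][1]
    return merged.most_common(size), error_upper_bound


print(terms_agg(shards, size=2, shard_size=2))
print(terms_agg(shards, size=2, shard_size=3))
```

With shard_size=2, SK148 is reported with a count of 50 although its true total is 75, because the second shard never sent its count. With shard_size=3 every shard sends all of its terms and the error bound drops to zero. In Elasticsearch the corresponding lever is the shard_size parameter of the terms aggregation, not the search type.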

Madhusudan_Atmakuri · January 11, 2018, 5:19pm

I am not sure I understand. Based on this:

https://github.com/elastic/elasticsearch/issues/20580

dfs_query_then_fetch should return results, skipping the first phase of query_then_fetch. The documentation does not say that it does not impact aggregations.

Mikhail_Khludnev · January 12, 2018, 3:36am

This issue is about an incorrect explanation, which is a separate thing.

Madhusudan_Atmakuri (https://discuss.elastic.co/u/madhusudan_atmakuri) wrote on January 11:

I am not sure I understand. Based on this:
https://github.com/elastic/elasticsearch/issues/20580
Issue: dfs_query_then_fetch does work sometimes
opened by XGiton (https://github.com/XGiton) on 2016-09-20
closed by jimczi (https://github.com/jimczi) on 2016-09-20

Elasticsearch version: Elasticsearch 5.0.0-alpha5
Plugins installed: [analyzer-ik]
JVM version: jdk1.8.0_101
OS version: Ubuntu 14.04
Description of the problem including expected versus actual behavior:
dfs_query_then_fetch does not...

dfs_query_then_fetch should return result skipping the first phase of
query_then_fetch. The documentation does not say that it does not impact
aggregations.

The documentation doesn't say it does.

system · February 9, 2018, 3:37am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
