How does dfs_query_then_fetch work?

Hi guys

As I understand it, dfs_query_then_fetch is used to get accurate results across multiple shards.

The terms aggregation returns an approximate count per term (I am still confused about how it calculates the document count error).

If I use a dfs_query_then_fetch search with a terms aggregation, will it return accurate results? I tried, but it does not: I am still getting a doc_count_error_upper_bound value greater than zero.

QUERY:

GET demo/_search?pretty=true&search_type=dfs_query_then_fetch
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "demo1.keyword",
                "size" : 5
            }
        }
    }
}

Result:

"aggregations": {
    "products": {
      "doc_count_error_upper_bound": 10959,
      "sum_other_doc_count": 10454016,
      "buckets": [
        {
          "key": "SK148",
          "doc_count": 4442
        },
        {
          "key": "SK67",
          "doc_count": 4432
        },
        {
          "key": "SK489",
          "doc_count": 4420
        },
        {
          "key": "SK592",
          "doc_count": 2245
        },
        {
          "key": "SK88",
          "doc_count": 2245
        }
      ]
    }
  }

I think I have a misconception here.

Please guide me on this. It would be a great help.

I believe dfs_query_then_fetch calculates terms' IDFs globally without
per-shard skew.
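
To illustrate what "globally" means here, a minimal Python sketch (not Elasticsearch code) comparing an IDF computed from a single shard's statistics with one computed from statistics summed over all shards, which is roughly what the extra DFS phase collects. The formula is the BM25 IDF used by Lucene's default similarity; the shard numbers and the function names are made up for illustration.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25 IDF for a term that occurs in doc_freq of doc_count documents."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def per_shard_idfs(shard_stats):
    """IDF as each shard would compute it from its own local statistics."""
    return [bm25_idf(s["doc_freq"], s["doc_count"]) for s in shard_stats]


def global_idf(shard_stats):
    """IDF computed from statistics summed over all shards (the DFS idea)."""
    total_doc_freq = sum(s["doc_freq"] for s in shard_stats)
    total_doc_count = sum(s["doc_count"] for s in shard_stats)
    return bm25_idf(total_doc_freq, total_doc_count)


if __name__ == "__main__":
    # Hypothetical statistics: the term is rare on one shard, common on the other.
    stats = [
        {"doc_count": 1000, "doc_freq": 10},
        {"doc_count": 1000, "doc_freq": 400},
    ]
    print("per-shard IDFs:", per_shard_idfs(stats))
    print("global IDF:", global_idf(stats))
```

With local statistics the same term is weighted very differently on each shard; with summed statistics every shard scores with one shared value. Only the score changes here, not which documents match or how they are counted.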

Yeah. I am assuming this means it will calculate counts over the whole data set rather than at the shard level, and I should get accurate counts with doc_count_error_upper_bound either absent or zero.

Am I right?

No. I believe dfs_query_then_fetch impacts only scoring (search), but not
aggregations.
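
A rough sketch of why that would be, assuming the simplified model described in the terms aggregation documentation: every shard returns only its own top shard_size terms, the coordinating node merges those lists, and doc_count_error_upper_bound is the sum of the last returned count from each shard that filled its list. No term statistics from a DFS phase enter into it. This is plain Python, not Elasticsearch code, and the function names and sample data are hypothetical.

```python
from collections import Counter


def shard_top_terms(shard_docs, shard_size):
    """Top shard_size (term, count) pairs for one shard's documents."""
    counts = Counter(shard_docs)
    # Count descending, then term ascending so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:shard_size]


def merged_terms(shards, size, shard_size):
    """Merge per-shard top lists; return (buckets, error_upper_bound)."""
    merged = Counter()
    error_bound = 0
    for docs in shards:
        top = shard_top_terms(docs, shard_size)
        for term, count in top:
            merged[term] += count
        # A shard that filled its list may be hiding terms whose count is
        # at most the count of the last term it did return.
        if top and len(top) >= shard_size:
            error_bound += top[-1][1]
    buckets = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:size]
    return buckets, error_bound


if __name__ == "__main__":
    # Hypothetical data: one term value per document on each shard.
    shard_a = ["a"] * 5 + ["b"] * 4 + ["c"] * 3
    shard_b = ["c"] * 6 + ["a"] * 2 + ["b"] * 1
    print(merged_terms([shard_a, shard_b], size=2, shard_size=2))
    print(merged_terms([shard_a, shard_b], size=2, shard_size=4))
```

In this model the bound only drops to zero once shard_size is large enough for every shard to return all of its terms, so shard_size on the terms aggregation is the setting that affects count accuracy, not search_type.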

I am not sure I understand. Based on this issue:
https://github.com/elastic/elasticsearch/issues/20580

dfs_query_then_fetch should return results, skipping the first phase of query_then_fetch. The documentation does not say that it has no effect on aggregations.

That issue is about an incorrect explanation, which is a separate thing.

Madhusudan_Atmakuri https://discuss.elastic.co/u/madhusudan_atmakuri
January 11

I am not sure I understand. Based on this:
Issue: dfs_query_then_fetch does work sometimes
https://github.com/elastic/elasticsearch/issues/20580
opened by XGiton and closed by jimczi on 2016-09-20

Elasticsearch version: Elasticsearch 5.0.0-alpha5
Plugins installed: [analyzer-ik]
JVM version: jdk1.8.0_101
OS version: Ubuntu 14.04
Description of the problem including expected versus actual behavior:
dfs_query_then_fetch does not...

dfs_query_then_fetch should return result skipping the first phase of
query_then_fetch. The documentation does not say that it does not impact
aggregations.

The documentation doesn't say it does.
