Question about an example in the official documentation

The following are the indexed documents:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 11,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : 10002,
          "name" : "李四",
          "age" : 20,
          "birthday" : "1990-06-20",
          "hobby" : [
            "唱歌",
            "跳舞"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "id" : 10003,
          "name" : "王五",
          "age" : 22,
          "birthday" : "1898-02-15",
          "hobby" : [
            "篮球",
            "跳舞"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : 10004,
          "name" : "赵六",
          "age" : 16,
          "birthday" : "1994-09-09",
          "hobby" : [
            "羽毛球",
            "听音乐"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "id" : 10005,
          "name" : "小明",
          "age" : 24,
          "birthday" : "1896-02-22",
          "hobby" : [
            "听音乐",
            "唱歌"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "id" : 10006,
          "name" : "吴唱歌",
          "age" : 20,
          "birthday" : "1990-06-20",
          "hobby" : [
            "唱歌",
            "写作业"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "id" : 10007,
          "name" : "elasticsearch",
          "age" : 18,
          "birthday" : "1992-08-10",
          "hobby" : [
            "编程",
            "打游戏"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "id" : 10008,
          "name" : "无名氏",
          "age" : null,
          "birthday" : null,
          "hobby" : null
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 10001,
          "name" : "张三",
          "age" : 18,
          "birthday" : "1992-08-10",
          "hobby" : [
            "篮球",
            "羽毛球"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "id" : 10009,
          "name" : "张三9",
          "age" : 20,
          "birthday" : "1990-03-15",
          "hobby" : [
            "写作业",
            "跑步"
          ]
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "id" : 100010,
          "name" : "李四10",
          "age" : 26,
          "birthday" : "1894-06-06",
          "hobby" : [
            "唱歌",
            "听音乐"
          ]
        }
      }
    ]
  }
}

When I run an example based on the official documentation, an error occurs:

POST /student/_search?size=0
{
    "aggs" : {
        "avg_age" : {
            "avg" : {
                "script" : {
                    "source" : "doc.age.value"
                }
            }
        }
    }
}

Error is as follows:

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:121)",
          "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:115)",
          "doc.age.value",
          "       ^---- HERE"
        ],
        "script": "doc.age.value",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "student",
        "node": "2k4JGwMnRBaEOGOgHY3gZg",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:121)",
            "org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:115)",
            "doc.age.value",
            "       ^---- HERE"
          ],
          "script": "doc.age.value",
          "lang": "painless",
          "caused_by": {
            "type": "illegal_state_exception",
            "reason": "A document doesn't have a value for a field! Use doc[<field>].size()==0 to check if a document is missing a field!"
          }
        }
      }
    ]
  },
  "status": 400
}

When I use the following script instead,

{
    "aggs" : {
        "avg_age" : {
            "avg" : {
                "script" : {
                    "source" : "doc.age"
                }
            }
        }
    }
}

the result is correct. Is my approach wrong, or is the official documentation incorrect?

There are two issues to discuss here: missing values and multi-values.

I'll start with multi-values.

Suppose you indexed a document with an array of values for "age". (Of course this does not make sense in the context of your use case, but the field could easily be one that represents something where multiple values are logical.) The result of the agg with the script "doc.age" would be to iterate over all of the indexed values in each doc, so they each would influence the avg. The script "doc.age.value" would only query each doc once, and the script would return the first indexed value of the field.

Missing values, though, are what the error you describe is reporting. That is, there are documents included in the aggregation set that do not have any "age" value (in your data, the document with "_id" 8 has "age" : null). As you demonstrate, when the script is "doc.age", such a document is silently (from the user's perspective) ignored, and the average is calculated as

sum of values on docs in agg set / number of values on docs in agg set

whereas the script "doc.age.value" causes an error when a document is missing the field; if no error occurs, the average is calculated as

sum of the first values on docs in agg set / number of docs in agg set
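
To make the difference between the two calculations concrete, here is a toy Python model of them. It is only a sketch of the arithmetic described above, not Elasticsearch code: the documents, their values, and the function names avg_all_values and avg_first_value are all made up for illustration.

```python
# Toy model of the two averaging behaviours (illustrative only, not Elasticsearch code).
# Each inner list holds the indexed "age" values of one hypothetical document;
# an empty list stands for a document with no value for the field.
docs = [[10, 20], [40], []]


def avg_all_values(docs):
    """Model of the script "doc.age": every indexed value counts, empty docs are skipped."""
    values = [v for doc in docs for v in doc]
    if not values:
        return None  # nothing to average
    return sum(values) / len(values)


def avg_first_value(docs):
    """Model of "doc.age.value": one value per doc; a doc without a value is an error."""
    firsts = []
    for doc in docs:
        if not doc:
            raise ValueError("A document doesn't have a value for a field!")
        firsts.append(doc[0])
    return sum(firsts) / len(firsts)


print(avg_all_values(docs))       # (10 + 20 + 40) / 3 values
print(avg_first_value(docs[:2]))  # (10 + 40) / 2 docs
```

Calling avg_first_value(docs) on all three documents raises the error, which mirrors what happens in your index once a document without an "age" value is part of the aggregation set.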

If you can guarantee that none of your documents are multi-valued, and you can accept both that some of your docs have no value for the field and how the avg calculation is then performed, your "doc.age" script should suffice.

What the error means by "Use doc[<field>].size()==0" is that you could add a guard in your script like:

"if (doc['age'].size()==0) { return null; } return doc.age.value;"
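
In case it helps to see where the guarded script sits in the full request, here is a small Python sketch that assembles the request body using only the standard library. The helper name build_avg_request is hypothetical, and the sketch only prints the JSON; actually sending it to the cluster is left out.

```python
import json

# The guarded script from above, copied unchanged.
GUARDED_SCRIPT = "if (doc['age'].size()==0) { return null; } return doc.age.value;"


def build_avg_request(source=GUARDED_SCRIPT):
    """Hypothetical helper: wrap a script in the same avg aggregation used in the question."""
    return {"aggs": {"avg_age": {"avg": {"script": {"source": source}}}}}


body = build_avg_request()

# Body intended for: POST /student/_search?size=0
print(json.dumps(body, indent=4))
```

The printed JSON has the same shape as the request in the question, with only the "source" string changed. The intent of the guard is that a document without an "age" value contributes nothing instead of raising the error.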

Your answer is very useful, thank you very much. It was my mistake.