Is there a way to take missing-value buckets into account with Rollup search?


(Jacomy) #1

Hi,

I am using the Rollup plugin 6.3.2 with Elasticsearch.
In my "source index", some field values may be null or missing.

With a normal Elasticsearch search request on the "rollup index", there is no problem using "missing": "__missing__" in a terms aggregation, like this:

GET index_rollup/_search
{
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp.date_histogram.timestamp",
        "interval": "1d",
        "time_zone": "Etc/UTC",
        "min_doc_count": 1
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "data.client.terms.value",
            "size": 100,
            "order": {
              "1": "desc"
            },
            "missing": "__missing__"
          },
          "aggs": {
            "1": {
              "sum": {
                "field": "data.client.terms._count"
              }
            },
            "4": {
              "terms": {
                "field": "data.method.terms.value",
                "size": 97,
                "order": {
                  "1": "desc"
                },
                "missing": "__missing__"
              },
              "aggs": {
                "1": {
                  "sum": {
                    "field": "data.client.terms._count"
                  }
                }
          }
      }

................................

Result: 
  "aggregations": {
    "2": {
      "buckets": [
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "1": {
                  "value": 3082196
                },
                "4": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "1": {
                        "value": 2252323
                      },
                      "5": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                          {
                            "1": {
                              "value": 2252323
                            },
                            "6": {
                              "doc_count_error_upper_bound": 0,
                              "sum_other_doc_count": 0,
                              "buckets": [
                                {
                                  "1": {
                                    "value": 2252323
                                  },
                                  "7": {
                                    "doc_count_error_upper_bound": 0,
                                    "sum_other_doc_count": 0,
                                    "buckets": [
                                      {
                                        "1": {
                                          "value": 2252323
                                        },
                                        "key": "__missing__",
                                        "doc_count": 1
                                      }
                                    ]
                                  },
                                  "key": "__missing__",
                                  "doc_count": 1
                                }
                              ]
                            },
                            "key": "__missing__",
                            "doc_count": 1
                          }
                        ]
                      },
                      "key": "__missing__",
                      "doc_count": 1
                    }

................................................

But with Rollup Search, "missing": "__missing__" is accepted at parse time (no error), yet at execution time the buckets for missing or empty values come back empty, whereas a normal Elasticsearch request takes them into account:

GET index-rollup/_rollup_search
{
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1d"
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "data.client.terms.value",
            "size": 100000000,
            "missing": "__missing__"
          },
          "aggs": {
             "1": {
              "sum": {
                "field": "data.payloadsize"
              }
            },
            "4": {
              "terms": {
                "field": "data.method",
                "size": 100000000,
                "missing": "__missing__"
              },
              "aggs": {
                "5": {
                  "terms": {
                    "field": "data.submethod",
                    "size": 1000000000,
                     "missing": "__missing__"
                  }
            }

...........................................


  "aggregations": {
    "2": {
      "meta": {},
      "buckets": [
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "4": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": []
                },
                "key": "POST",
                "doc_count": 791573
              }
            ]
          }

.......................

Is there a way to take missing-value buckets into account with Rollup search?


(Zachary Tong) #2

Sorry for the delay, this slipped through my inbox.

It may be possible to support missing, although I need to do some thinking about it. The missing functionality is a little tricky due to how it's implemented internally.

But regardless, we should not allow a "bad" request to run. E.g. either missing should be supported, or we should throw an exception saying it's unsupported at the moment. I'll open a ticket to address this point in a minute, thanks for raising it to my attention!
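
Until this is handled on the server side, a client-side guard is one possible stopgap. The following is only a minimal sketch in Python, assuming the request body is available as a plain dict; the function names `find_missing_params` and `check_rollup_request` are hypothetical, and this is not how Elasticsearch validates requests internally.

```python
def find_missing_params(aggs, path=()):
    """Return the paths of all terms aggregations that set "missing"."""
    hits = []
    for name, body in aggs.items():
        here = path + (name,)
        terms = body.get("terms")
        if isinstance(terms, dict) and "missing" in terms:
            hits.append("/".join(here))
        # Recurse into sub-aggregations, accepted under either key.
        sub = body.get("aggs") or body.get("aggregations") or {}
        hits.extend(find_missing_params(sub, here))
    return hits


def check_rollup_request(request):
    """Raise ValueError if a terms aggregation in the request uses "missing"."""
    aggs = request.get("aggs") or request.get("aggregations") or {}
    hits = find_missing_params(aggs)
    if hits:
        raise ValueError('"missing" used in terms aggregations: ' + ", ".join(hits))
    return request
```

Calling `check_rollup_request(body)` before sending `body` to `_rollup_search` would turn the silent empty buckets into an explicit error on the client.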


(Jacomy) #3

Hi,
Thanks for your reply.

I have solved the problem with an index template for my rollup index.
I force the affected fields to a "NULL" value with null_value, like this:

{
  "index_patterns": ["pfs_pnsapi-rollup*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "date_histograms": {
            "path_match": "*.date_histogram.timestamp",
            "mapping": {
              "type": "date"
            }
          }
        }
      ],
      "properties": {
.............................
        "q_client": {
          "properties": {
            "terms": {
              "properties": {
                "_count": {
                  "type": "long"
                },
                "value": {
                  "type": "keyword",
                  "null_value": "NULL"
                }
              }
            }
          }
        },
        "q_method": {
          "properties": {
            "terms": {
              "properties": {
                "_count": {
                  "type": "long"
                },
                "value": {
                  "type": "keyword",
                  "null_value": "NULL"
                }
              }
            }
          }
        },
....................................
and it seems to work correctly with _rollup_search.
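
Since the same mapping block repeats for every terms field, it could be generated instead of written by hand. This is just a sketch in Python under that assumption; the helper names `terms_field_mapping` and `build_properties` are made up for illustration, and only the two field names shown in the template above are used.

```python
import json


def terms_field_mapping(null_value="NULL"):
    """Mapping for one rolled-up terms field, with null_value on the keyword."""
    return {
        "properties": {
            "terms": {
                "properties": {
                    "_count": {"type": "long"},
                    # Explicit nulls are indexed as this placeholder value.
                    "value": {"type": "keyword", "null_value": null_value},
                }
            }
        }
    }


def build_properties(field_names, null_value="NULL"):
    """Build the "properties" fragment for a list of terms fields."""
    return {name: terms_field_mapping(null_value) for name in field_names}


if __name__ == "__main__":
    print(json.dumps(build_properties(["q_client", "q_method"]), indent=2))
```

The printed fragment can be pasted under "properties" in the template. Documents without a value then land in a "NULL" bucket, so the "missing" parameter is no longer needed in the query.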

