How to calculate the missing documents per minute

alerting

(krishna_gaddipati) #1

Hi Elastic Team, with the query below I was able to query the index and get the count of documents per minute between a specified min and max time.
I am trying to integrate this query into a watcher script and calculate the number of empty documents, which is count (86) - sum (12) = 74.

GET dummyindex/_search?size=0
{
  "aggs": {
    "by_host": {
      "terms": {
        "field": "hostname.keyword"
      },
      "aggs": {
        "by_minute": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "minute",
            "missing": "0",
            "extended_bounds": {
              "max": "2018-05-15T09:55:00-07:00",
              "min": "2018-05-15T08:30:00-07:00"
            }
          }
        },
        "aggs": {
          "stats_bucket": {
            "buckets_path": "by_minute._count"
          }
        }
      }
    }
  }
}

Output from the above query:

"aggregations": {
  "by_host": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "host2",
        "doc_count": 12,
        "by_minute": { <--> },
        "aggs": {
          "count": 86,
          "min": 0,
          "max": 2,
          "avg": 0.13953488372093023,
          "sum": 12
        }
      },
      {
        "key": "host1",
        "doc_count": 10,
        "by_minute": { <--> },
        "aggs": {
          "count": 86,
          "min": 0,
          "max": 1,
          "avg": 0.11627906976744186,
          "sum": 10
        }
      }
    ]
  }
}

Below is my watcher script. Can you please suggest how to do this calculation?

      POST _xpack/watcher/watch/downtimeexample
    {
     "trigger": {
       "schedule": {
         "interval": "10s"
       }
     },
     "input": {
       "search": {
         "request": {
           "indices": [
             "dummyindex"
           ],
           "body": {
             "size": 0,
             "query": {
               "range": {
                 "@timestamp": {
                   "gte": "now-14d/d"
                 }
               }
             },
             "aggs": {
               "by_host": {
                 "terms": {
                   "field": "hostname.keyword"
                 },
                 "aggs": {
                   "by_minute": {
                     "date_histogram": {
                       "field": "@timestamp",
                       "interval": "minute",
                       "missing": "0",
                       "extended_bounds": {
                         "max": "2018-05-15T09:55:00-07:00",
                         "min": "2018-05-15T08:30:00-07:00"
                       }
                     }
                   },
                   "aggs": {
                     "stats_bucket": {
                       "buckets_path": "by_minute._count"
                     }
                   }
                 }
               }
             }
           }
         }
       }
     },
     "condition": {
      "compare": {
       "ctx.payload.hits.total": {
        "gte": 1
       }
      }
     },
     "transform": {
      "script": {
       "source": " def hosts = ctx.payload.aggregations.hostname_agg.buckets; return [   'empty_minute_data' : hosts.ctx.payload.aggregations.stats_bucket.count???? I have no idea where to start???, '@timestamp': ctx.trigger.scheduled_time];"
      }
     },   
     "actions": {
      "my_index_action": {
       "index": {
        "index": "summary-index-host-availability",
        "doc_type": "mytype"
       }
      }
     }
    }

(Alexander Reelsen) #3

Hey,

I think your assumption is wrong or I may have misunderstood it. The sum in the bucket is the sum of all the values encountered (which also equals the count multiplied by the average) - but this is not the number of documents which do not have a timestamp set.
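To make the arithmetic concrete, here is a minimal Python sketch using the `stats_bucket` values for host2 from the output above. The `per_minute` list is a hypothetical distribution that happens to match those stats (the real per-minute counts are collapsed in the original output), so treat it as an illustration only. It suggests that count - sum matches the number of empty minutes only when no minute holds more than one document.

```python
import math

# stats_bucket values for host2, copied from the output in the question
stats = {"count": 86, "min": 0, "max": 2, "avg": 0.13953488372093023, "sum": 12}

# sum equals count * avg (up to floating-point rounding)
assert math.isclose(stats["count"] * stats["avg"], stats["sum"])

# Hypothetical per-minute document counts consistent with those stats:
# one minute with 2 docs, ten minutes with 1 doc, 75 minutes with none.
per_minute = [2] + [1] * 10 + [0] * 75
assert len(per_minute) == stats["count"]
assert sum(per_minute) == stats["sum"]

# Minutes that really have no documents in this hypothetical distribution
empty_minutes = sum(1 for c in per_minute if c == 0)

# The calculation proposed in the question
count_minus_sum = stats["count"] - stats["sum"]

print(empty_minutes, count_minus_sum)
```

For host1, where the maximum per minute is 1, the two numbers would coincide; for host2 they need not.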

Can you elaborate in more detail on what you are after? Please don't explain it in terms of the watch; describe the data you would like to extract, ignoring your data structure and the watch.

Thanks.

--alex
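
If the goal is the number of minutes without any documents per host, one possible approach is to count the `by_minute` buckets whose `doc_count` is 0. The Python sketch below shows only that logic, not a Painless transform. It assumes the response contains the expanded `by_minute.buckets` list, including empty buckets, and the function name `empty_minutes_per_host` and the sample data are hypothetical.

```python
def empty_minutes_per_host(aggregations):
    """Map each host key to the number of minute buckets with doc_count == 0."""
    result = {}
    for host in aggregations["by_host"]["buckets"]:
        minute_buckets = host["by_minute"]["buckets"]
        result[host["key"]] = sum(
            1 for bucket in minute_buckets if bucket["doc_count"] == 0
        )
    return result


# Hypothetical, heavily shortened response: four one-minute buckets per host.
# Keys are made-up epoch-millisecond values, not taken from the thread.
sample = {
    "by_host": {
        "buckets": [
            {
                "key": "host1",
                "doc_count": 2,
                "by_minute": {"buckets": [
                    {"key": 0, "doc_count": 1},
                    {"key": 60000, "doc_count": 0},
                    {"key": 120000, "doc_count": 0},
                    {"key": 180000, "doc_count": 1},
                ]},
            },
            {
                "key": "host2",
                "doc_count": 4,
                "by_minute": {"buckets": [
                    {"key": 0, "doc_count": 2},
                    {"key": 60000, "doc_count": 1},
                    {"key": 120000, "doc_count": 0},
                    {"key": 180000, "doc_count": 1},
                ]},
            },
        ]
    }
}

print(empty_minutes_per_host(sample))
```

Counting zero-document buckets directly avoids the count - sum shortcut, which breaks as soon as a minute contains more than one document (host2 in the sample has 4 buckets and 4 documents, yet one empty minute).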