How to calculate the missing documents per minute

saikrishnagaddipati · May 17, 2018, 8:24pm

Hi Elastic Team, with the query below I was able to query the index and get the count of all documents per minute for a specified min and max time.
I am trying to integrate this query into a Watcher watch and calculate the number of empty documents, which is count (86) - sum (12) = 74.

GET dummyindex/_search?size=0
{
  "aggs": {
    "by_host": {
      "terms": {
        "field": "hostname.keyword"
      },
      "aggs": {
        "by_minute": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "minute",
            "missing": "0",
            "extended_bounds": {
              "max": "2018-05-15T09:55:00-07:00",
              "min": "2018-05-15T08:30:00-07:00"
            }
          }
        },
        "aggs": {
          "stats_bucket": {
            "buckets_path": "by_minute._count"
          }
        }
      }
    }
  }
}

Output from the above query:

"aggregations": {
  "by_host": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "host2",
        "doc_count": 12,
        "by_minute": { ... },
        "aggs": {
          "count": 86,
          "min": 0,
          "max": 2,
          "avg": 0.13953488372093023,
          "sum": 12
        }
      },
      {
        "key": "host1",
        "doc_count": 10,
        "by_minute": { ... },
        "aggs": {
          "count": 86,
          "min": 0,
          "max": 1,
          "avg": 0.11627906976744186,
          "sum": 10
        }
      }
    ]
  }
}

Below is my watch. Can you please suggest how to do this calculation in the transform script?

    POST _xpack/watcher/watch/downtimeexample
    {
      "trigger": {
        "schedule": {
          "interval": "10s"
        }
      },
      "input": {
        "search": {
          "request": {
            "indices": [
              "dummyindex"
            ],
            "body": {
              "size": 0,
              "query": {
                "range": {
                  "@timestamp": {
                    "gte": "now-14d/d"
                  }
                }
              },
              "aggs": {
                "by_host": {
                  "terms": {
                    "field": "hostname.keyword"
                  },
                  "aggs": {
                    "by_minute": {
                      "date_histogram": {
                        "field": "@timestamp",
                        "interval": "minute",
                        "missing": "0",
                        "extended_bounds": {
                          "max": "2018-05-15T09:55:00-07:00",
                          "min": "2018-05-15T08:30:00-07:00"
                        }
                      }
                    },
                    "aggs": {
                      "stats_bucket": {
                        "buckets_path": "by_minute._count"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      "condition": {
        "compare": {
          "ctx.payload.hits.total": {
            "gte": 1
          }
        }
      },
      "transform": {
        "script": {
          "source": " def hosts = ctx.payload.aggregations.hostname_agg.buckets; return [   'empty_minute_data' : hosts.ctx.payload.aggregations.stats_bucket.count???? I have no idea where to start???, '@timestamp': ctx.trigger.scheduled_time];"
        }
      },
      "actions": {
        "my_index_action": {
          "index": {
            "index": "summary-index-host-availability",
            "doc_type": "mytype"
          }
        }
      }
    }

system · June 14, 2018, 8:24pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

spinscale · July 3, 2018, 9:12am

Hey,

I think your assumption is wrong or I may have misunderstood it. The sum in the bucket is the sum of all the values encountered (which also equals the count multiplied by the average) - but this is not the number of documents which do not have a timestamp set.
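
To make the arithmetic concrete, here is a small Python sketch with made-up per-minute counts (hypothetical data, chosen only so that it reproduces the count, max, and sum reported for host2 above). It suggests that count - sum matches the number of empty minutes only when no minute holds more than one document; otherwise it undercounts.

```python
# Hypothetical per-minute document counts for one host: 86 one-minute
# buckets, 10 with one document, 1 with two documents, 75 empty.
per_minute_counts = [1] * 10 + [2] + [0] * 75

count = len(per_minute_counts)   # number of buckets, like stats_bucket "count"
total = sum(per_minute_counts)   # documents across all buckets, like "sum"
avg = total / count              # like "avg", so count * avg gives back the sum

# Minutes that really contain no documents.
empty_minutes = sum(1 for c in per_minute_counts if c == 0)

print("count:", count, "sum:", total, "max:", max(per_minute_counts))
print("count - sum:", count - total)
print("empty minutes:", empty_minutes)
```

With these invented counts, count - sum is 74 while 75 minutes are actually empty, because one minute contains two documents.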

Can you elaborate on what you are after? Please don't explain it in terms of the watch; describe what data you would like to extract, ignoring your data structure and the watch.
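
If the goal turns out to be the number of minutes per host that contain no documents, one possible approach is to count the `by_minute` buckets whose `doc_count` is 0. The sketch below is in Python rather than Painless so it can be run outside Watcher. It assumes the elided `by_minute` part of the response holds a `buckets` list with a `doc_count` per minute, and the function name and sample data are hypothetical. Note also that the aggregation in the watch is named `by_host`, not `hostname_agg` as referenced in the transform script.

```python
def empty_minutes_per_host(aggregations):
    """Return {hostname: number of minute buckets with doc_count == 0}."""
    result = {}
    for host in aggregations["by_host"]["buckets"]:
        minute_buckets = host["by_minute"]["buckets"]
        result[host["key"]] = sum(
            1 for bucket in minute_buckets if bucket["doc_count"] == 0
        )
    return result


# Hypothetical, heavily shortened response: three one-minute buckets per host.
sample_aggregations = {
    "by_host": {
        "buckets": [
            {"key": "host2", "doc_count": 3, "by_minute": {"buckets": [
                {"doc_count": 2}, {"doc_count": 0}, {"doc_count": 1}]}},
            {"key": "host1", "doc_count": 1, "by_minute": {"buckets": [
                {"doc_count": 0}, {"doc_count": 1}, {"doc_count": 0}]}},
        ]
    }
}

print(empty_minutes_per_host(sample_aggregations))
```

The same loop could in principle be written in a Painless transform over `ctx.payload.aggregations.by_host.buckets`, but that variant is untested here.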

Thanks.

--alex
