Getting count based on field value


(jdepp99) #1
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "analyze_wildcard": true,
          "query": ""
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "match": {
                  "PStream": {
                    "query": "
",
                    "type": "phrase"
                  }
                }
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": 1447777019722,
                  "lte": 1447780619722
                }
              }

I am trying to run a query that returns the total message count for each possible value of the field PStream. There are over 100 different PStream values, and I am setting up an alert system that triggers when the count for any PStream value is 0 over the last hour. Could someone assist or suggest a way to construct this query? I would really appreciate it.

Thanks


(Colin Goodheart-Smithe) #2

You could use a bool query with a must_not clause that contains the missing query. That matches all messages where PStream has a value (i.e. where the value is not missing).
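A minimal sketch of that suggestion, assuming the same `filtered` query syntax used elsewhere in this thread and that `missing` is accepted as a filter clause in the Elasticsearch version in use. Later releases drop both `filtered` and `missing` in favour of a plain `bool` query with `exists`, so treat this as illustrative and not as a version-independent answer:

```json
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must_not": [
            { "missing": { "field": "PStream" } }
          ]
        }
      }
    }
  }
}
```

Sent to `/logstash-*/_search?search_type=count`, a body like this should report in `hits.total` how many documents carry any PStream value at all. It does not break the count down per value.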


(Vincent Biret) #3

Note: this approach only works if the field PStream is missing from the document:

"query": {
  "filtered": {
    "query": {
      "query_string": {
        "analyze_wildcard": true,
        "query": "_exists_:PStream"
      }
    },

You can then delete the match query on PStream from the filter.


(jdepp99) #4

Thanks for the responses and suggestions. I apologize, I should have been a little clearer. I am mainly concerned with the counts for each value of that field, not with missing values: the field will always be populated and serves as the identifier.

GET /logstash-*/_search?search_type=count
{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "PStream": "864"
        }
      }
    }
  }
}

Returns the following:

  "hits": {
    "total": 21657418,
    "max_score": 0,
    "hits": []
  }

This is closer to what I need. I need the message counts for every PStream value (PStream: *), and a count of 0 should create an alert. The only thing I don't have right yet is the time window. I tried adding the following but am getting a parser error:

{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "PStream": "864"
        }
      },
      "aggs": {
        "by_day": {
          "date_histogram": {
            "field": "date",
            "interval": "hour"
          }
        }
      }
    }
  }
}
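The parser error most likely comes from where `aggs` sits: in the request above it is nested inside `filtered`, whereas aggregations are expected at the top level of the request body, as a sibling of `query`. A hedged sketch of the restructured request follows. It assumes the time field is `@timestamp` (as in the first post, not the `date` field used in the attempt) and uses the date-math expression `now-1h` for the one-hour window; the aggregation name `by_hour` is arbitrary:

```json
{
  "query": {
    "filtered": {
      "query": {
        "match": { "PStream": "864" }
      },
      "filter": {
        "range": {
          "@timestamp": { "gte": "now-1h", "lte": "now" }
        }
      }
    }
  },
  "aggs": {
    "by_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      }
    }
  }
}
```

With this shape, `hits.total` would be the count for PStream 864 over the last hour, and the `date_histogram` is only needed if a per-hour breakdown is wanted in addition to the total.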

Also how do I filter for all PStream values:

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "analyze_wildcard": true,
          "query": "_exists_:PStream"
        }
      }
    }
  }
}
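One way to get a count for every PStream value in a single request is a `terms` aggregation combined with a time-range filter. The sketch below is an assumption-laden illustration, not a tested answer from the thread: the aggregation name `per_pstream` is arbitrary, `size` is set to 200 only because the first post mentions "over 100" values, and `min_doc_count: 0` asks for buckets even for terms that matched no document in the window, which is what a "count = 0" alert would look for.

```json
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "exists": { "field": "PStream" } },
            { "range": { "@timestamp": { "gte": "now-1h" } } }
          ]
        }
      }
    }
  },
  "aggs": {
    "per_pstream": {
      "terms": {
        "field": "PStream",
        "size": 200,
        "min_doc_count": 0
      }
    }
  }
}
```

Each bucket in the response carries a `key` (the PStream value) and a `doc_count`; buckets with `doc_count` of 0 would be the alert candidates. Two caveats: zero-count buckets can only appear for values that exist somewhere in the searched indices, and if PStream is an analyzed string field the buckets will be per token, in which case a not_analyzed variant of the field (if the mapping provides one) is probably the better target.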

