Show the server that is down

Hi,

I am sending the following data to Elasticsearch from multiple servers:
SysTime, HostName, MemFree, CPULoad

I send this data every 5 minutes; that is, every 5 minutes about 10 servers in my deployment each send this data to Elasticsearch.

If data from any server is missing for the last 5 minutes (or, say, the last 6 minutes, to avoid boundary-condition issues), then I want to show that server as "down".

I don't want to hard-code the list of servers to look for, since I may dynamically start monitoring more servers. Instead, I want to find the list of distinct servers seen within the last 10 minutes and the list of distinct servers seen within the last 5 minutes, and compare them. If a server is missing from the last-5-minute list, then I want to show it as "down".
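
To make the intended comparison concrete, here is a rough sketch of the logic in plain Python, outside of Elasticsearch. The function name `find_down_hosts`, the `(HostName, SysTime)` tuple layout, and the 6-minute and 10-minute defaults are only assumptions for illustration; they are not part of any existing setup.

```python
from datetime import datetime, timedelta


def find_down_hosts(records, now,
                    recent=timedelta(minutes=6),
                    lookback=timedelta(minutes=10)):
    """Return hosts seen in the lookback window but not in the recent window.

    records: iterable of (host_name, sys_time) tuples (hypothetical layout).
    """
    # Every host that reported at least once in the longer window.
    known = {host for host, ts in records if now - lookback <= ts <= now}
    # Every host that reported at least once in the shorter, recent window.
    alive = {host for host, ts in records if now - recent <= ts <= now}
    # Hosts that were known earlier but have gone silent recently.
    return sorted(known - alive)


# Illustrative data: server2 stops reporting after 00:05.
records = [
    ("server1", datetime(2017, 12, 18, 0, 5)),
    ("server2", datetime(2017, 12, 18, 0, 5)),
    ("server1", datetime(2017, 12, 18, 0, 10)),
]
print(find_down_hosts(records, now=datetime(2017, 12, 18, 0, 12)))
```

Because the host list is derived from the data itself, a newly monitored server is picked up as soon as it reports once, with no hard-coded list.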

Preferably, the server would be shown as "down" in Kibana (currently on 5.6, but I can move to 6 if needed; the same goes for Elasticsearch). If that is not possible in Kibana, then I would like to get that output through an Elasticsearch query.

Is it possible to do so?

Thanks so much.

Hi @deuskars,

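You could use a terms aggregation to do it. Run one search per time window and compare the returned buckets: a server that appears in the earlier window but not in the later one is down. Please consider this full example: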

PUT monitoring/doc/1
{
  "server": "server1",
  "cpu": 20,
  "timestamp": "2017-12-18T00:00:00"
}

PUT monitoring/doc/2
{
  "server": "server2",
  "cpu": 30,
  "timestamp": "2017-12-18T00:00:00"
}

PUT monitoring/doc/3
{
  "server": "server1",
  "cpu": 15,
  "timestamp": "2017-12-18T00:10:00"
}

GET monitoring/_search
{
  "size": 0, 
  "query": {
    "bool": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2017-12-18T00:00:00",
            "lt": "2017-12-18T00:10:00"
          }
        }
      }
    }
  },
  "aggs": {
    "servers": {
      "terms": {
        "field": "server.keyword",
        "size": 10
      }
    }
  }
}

GET monitoring/_search
{
  "size": 0, 
  "query": {
    "bool": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2017-12-18T00:10:00",
            "lt": "2017-12-18T00:20:00"
          }
        }
      }
    }
  },
  "aggs": {
    "servers": {
      "terms": {
        "field": "server.keyword",
        "size": 10
      }
    }
  }
}
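
The comparison of the two results has to happen on the client side. As a minimal sketch, assuming the two search responses have already been fetched and parsed into Python dicts, the bucket keys can be diffed as below. The helper names `bucket_keys` and `down_servers` are hypothetical, and the sample responses are trimmed by hand to the fields the code reads, matching what the three sample documents should produce.

```python
def bucket_keys(response, agg_name="servers"):
    """Collect the bucket keys of a terms aggregation from a search response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"] for bucket in buckets}


def down_servers(earlier_response, later_response):
    """Servers present in the earlier window but absent from the later one."""
    return sorted(bucket_keys(earlier_response) - bucket_keys(later_response))


# Hand-trimmed responses for the windows 00:00-00:10 and 00:10-00:20.
earlier = {"aggregations": {"servers": {"buckets": [
    {"key": "server1", "doc_count": 1},
    {"key": "server2", "doc_count": 1},
]}}}
later = {"aggregations": {"servers": {"buckets": [
    {"key": "server1", "doc_count": 1},
]}}}

print(down_servers(earlier, later))
```

Note that "size": 10 in the terms aggregation caps the result at 10 buckets, so it would need to be raised if more than 10 servers are monitored.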

But why not use Heartbeat, which already does the hard work and comes with great dashboards?

Cheers,
LG
