Possible bug in search query for Elasticsearch Version Mismatch alert


(Aaron L) #1

The generated X-Pack Watcher alert that checks for Elasticsearch node version mismatches never seems to pull back any data, and the way the alert is written, it will never trigger when there is no data. The alert's input is a chain whose first input, called "check", searches the indices matching the pattern ".monitoring-es-*". It narrows the results to a particular cluster by comparing each document's "_id" field to the cluster's UUID. I believe this is a mistake: it should filter on the "cluster_uuid" field, as the other built-in alerts do. No document appears to have an ID equal to the cluster UUID, so no records are ever returned and the alert will never fire under normal circumstances.
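To illustrate the mismatch, here is a rough Python sketch that mimics an exact-match term filter over a single monitoring document. The document is a simplified stand-in built from the sample hit shown further down in this post; the `term_filter` helper and the document layout are assumptions made for illustration, not how Elasticsearch evaluates the query internally.

```python
CLUSTER_UUID = "QwL3uLoyTbiOGrLhFgRqGA"

# Simplified stand-in for one monitoring document (assumed layout).
# The _id is auto-generated and unrelated to the cluster UUID.
docs = [
    {
        "_id": "Ui9ptGEBP6uiZ7_aPFBd",
        "_source": {
            "cluster_uuid": CLUSTER_UUID,
            "type": "cluster_stats",
            "cluster_stats": {"nodes": {"versions": ["6.1.3"]}},
        },
    },
]


def term_filter(docs, field, value):
    """Hypothetical helper: exact match on _id or on a top-level _source field."""
    if field == "_id":
        return [d for d in docs if d["_id"] == value]
    return [d for d in docs if d["_source"].get(field) == value]


by_id = term_filter(docs, "_id", CLUSTER_UUID)             # what the alert does
by_uuid = term_filter(docs, "cluster_uuid", CLUSTER_UUID)  # what I'd expect

print(len(by_id), len(by_uuid))
```

In this toy setup the "_id" comparison matches nothing while the "cluster_uuid" comparison matches the document, which mirrors what I see against the real indices.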

We are using the Elasticsearch 6.1.2 Docker images. I've verified this on the base flavor as well as the platinum flavor.

Given my cluster UUID is 'QwL3uLoyTbiOGrLhFgRqGA', this is the search being performed that returns no hits:

GET .monitoring-es-*/_search
{
  "size": 1,
  "_source": [
    "cluster_stats.nodes.versions"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "_id": "QwL3uLoyTbiOGrLhFgRqGA"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "_type": "cluster_stats"
                }
              },
              {
                "term": {
                  "type": "cluster_stats"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "sort": [
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ]
}

which returns:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

Here is what I believe the search should be. It does return the data the alert needs:

GET .monitoring-es-*/_search
{
  "size": 1,
  "_source": [
    "cluster_stats.nodes.versions"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "cluster_uuid": "QwL3uLoyTbiOGrLhFgRqGA"
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "_type": "cluster_stats"
                }
              },
              {
                "term": {
                  "type": "cluster_stats"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "sort": [
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ]
}

which returns:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 153,
    "max_score": null,
    "hits": [
      {
        "_index": ".monitoring-es-6-2018.02.20",
        "_type": "doc",
        "_id": "Ui9ptGEBP6uiZ7_aPFBd",
        "_score": null,
        "_source": {
          "cluster_stats": {
            "nodes": {
              "versions": [
                "6.1.3"
              ]
            }
          }
        },
        "sort": [
          1519150251083
        ]
      }
    ]
  }
}
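For anyone who wants to compare the two searches side by side, here is a small Python sketch that builds both request bodies from one function, so the only difference is the field used in the first term filter. The function name `build_version_search` and its `match_field` parameter are made up for this example; the body itself is copied from the queries above.

```python
import json


def build_version_search(cluster_uuid, match_field="cluster_uuid"):
    """Build the search body from this post, filtering on match_field."""
    return {
        "size": 1,
        "_source": ["cluster_stats.nodes.versions"],
        "query": {
            "bool": {
                "filter": [
                    # The only part that differs between the two searches.
                    {"term": {match_field: cluster_uuid}},
                    {
                        "bool": {
                            "should": [
                                {"term": {"_type": "cluster_stats"}},
                                {"term": {"type": "cluster_stats"}},
                            ]
                        }
                    },
                ]
            }
        },
        "sort": [{"timestamp": {"order": "desc"}}],
    }


uuid = "QwL3uLoyTbiOGrLhFgRqGA"
current = build_version_search(uuid, match_field="_id")  # as generated today
proposed = build_version_search(uuid)                    # suggested fix

print(json.dumps(current["query"]["bool"]["filter"][0]))
print(json.dumps(proposed["query"]["bool"]["filter"][0]))
```

Either body can be pasted into a `GET .monitoring-es-*/_search` request to reproduce the results shown above.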
