How to find data anomalies?

We have various indexes in Elasticsearch and I wonder if it'd be possible to find anomalies in the data?

For example:
We have a document for each test we run. The document has the test name in one field and the result (pass/fail) in another. How can I find the tests that change their result most frequently?

More complex case: We also have the commit that we ran the test on in a separate field. Can I find test name and commit combinations where the results change?

Another example:
We record API response times and I want to find the queries where the response time has a large variance and find what could cause that variance (e.g. an unhealthy k8s pod).

Welcome to our community! :smiley:

Take a look at Anomaly detection | Machine Learning in the Elastic Stack [8.2] | Elastic :slight_smile:

Hi Mark! Can you give me more guidance? To be honest, machine learning seems like overkill for the first example I gave.

It's not overkill; that's how anomaly detection is done in our stack.

How could I make this work with Machine Learning in ES?

Do you mean that the number of transitions for pass-to-fail or fail-to-pass is what is interesting here? For example, if in a particular time frame:

Test A: pass, fail, pass, fail, pass, fail, pass = 6 state changes
Test B: pass, pass, pass, pass, pass, pass = 0 state changes
Test C: fail, fail, fail, fail, fail, fail = 0 state changes

Then you're interested in highlighting Test A because it is "flip-flopping"?

Please clarify.

Exactly, Rich. I'm looking to find tests that "flip-flop" the most. I'd love to be able to visualize the "top flip-flopping tests":

Test Foo: 6 state changes
Test Bar: 2 state changes
Test Baz: 1 state change
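To pin down what "state changes" means here, a minimal sketch (illustrative Python, not Elasticsearch code; the function name is made up) that counts transitions in a time-ordered result sequence:

```python
def count_flipflops(results):
    """Count transitions between consecutive differing results
    (pass -> fail or fail -> pass) in a time-ordered sequence."""
    return sum(1 for prev, cur in zip(results, results[1:]) if prev != cur)

# The three example tests from earlier in the thread:
print(count_flipflops(["pass", "fail", "pass", "fail", "pass", "fail", "pass"]))  # 6
print(count_flipflops(["pass"] * 6))  # 0
print(count_flipflops(["fail"] * 6))  # 0
```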

Ok, that clarification helps. Can you also tell me how often the tests run? At a predictable interval?

It's not a predictable interval, but we don't have to be extremely accurate on the timing. The important part is that we would be able to look at flips for a certain commit and test combination, preferably aggregated by test name (and not the commits).

Ok - so I think this should be done with a Transform, which will pivot on the test name and then use a scripted_metric aggregation to gather up all of the individual test runs, sort them by time, and count the number of times the test result "flips state" (from pass to fail or fail to pass).

Here's a worked example:

Note: this uses the _preview endpoint of the transform, but you can have the Transform persist the results to an index to use for reporting, etc.

First, I define an index to use:

PUT testruns/
{
  "mappings": {
    "properties": {
      "testrundate": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      },
      "testnamne": {
        "type": "keyword"
      },
      "result": {
        "type": "keyword"
      }
    }
  }
}

Next, I put some sample docs in there:

PUT testruns/_doc/1
{
  "testrundate": "2022-05-20 12:00:00",
  "testnamne": "foo",
  "result": "pass"
}

PUT testruns/_doc/2
{
  "testrundate": "2022-05-20 13:00:00",
  "testnamne": "foo",
  "result": "fail"
}

PUT testruns/_doc/3
{
  "testrundate": "2022-05-20 14:00:00",
  "testnamne": "foo",
  "result": "pass"
}

PUT testruns/_doc/4
{
  "testrundate": "2022-05-20 13:00:00",
  "testnamne": "bar",
  "result": "fail"
}

PUT testruns/_doc/5
{
  "testrundate": "2022-05-20 14:00:00",
  "testnamne": "bar",
  "result": "fail"
}

PUT testruns/_doc/6
{
  "testrundate": "2022-05-20 11:00:00",
  "testnamne": "foo",
  "result": "fail"
}

Now I run the Transform:

POST _transform/_preview
{
  "source": {
    "index": [
      "testruns"
    ]
  },
  "pivot": {
    "group_by": {
      "testnamne": {
        "terms": {
          "field": "testnamne"
        }
      }
    },
    "aggregations": {
      "num_results": {
        "value_count": {
          "field": "result"
        }
      },
      "flipflops" :
      {
        "scripted_metric": {
          "init_script": "state.docs = []",
          "map_script": """
              Map testrun = [
              'testrundate':doc['testrundate'].value,
              'testnamne':doc['testnamne'].value,
              'result':doc['result'].value
            ];
            state.docs.add(testrun)
          """,
          "combine_script": "return state.docs;",
          "reduce_script": """ 
            def all_docs = [];
            for (s in states) {
              for (testrun in s) {
                all_docs.add(testrun);
              }
            }
            all_docs.sort((HashMap o1, HashMap o2)->o1['testrundate'].toEpochMilli().compareTo(o2['testrundate'].toEpochMilli()));
            def size = all_docs.size();
            def min_time = all_docs[0]['testrundate'];
            def max_time = all_docs[size-1]['testrundate'];
            def duration = max_time.toEpochMilli() - min_time.toEpochMilli();
            def last_testresult = '';
            def flipflopcount = 0L;
            for (s in all_docs) {if (s.result != (last_testresult)) {flipflopcount++; last_testresult = s.result;}}
            def ret = new HashMap();
            ret['first_time'] = min_time;
            ret['last_time'] = max_time;
            ret['duration_minutes'] = duration/1000/60;
            ret['flipflopcount'] = flipflopcount-1;
            return ret;
          """
        }
      }
    }
  }
}

And the output is:

  "preview" : [
    {
      "testnamne" : "bar",
      "num_results" : 2,
      "flipflops" : {
        "duration_minutes" : 60,
        "first_time" : "2022-05-20T13:00:00.000Z",
        "last_time" : "2022-05-20T14:00:00.000Z",
        "flipflopcount" : 0
      }
    },
    {
      "testnamne" : "foo",
      "num_results" : 4,
      "flipflops" : {
        "duration_minutes" : 180,
        "first_time" : "2022-05-20T11:00:00.000Z",
        "last_time" : "2022-05-20T14:00:00.000Z",
        "flipflopcount" : 3
      }
    }
  ],

This makes sense: test foo had 4 runs and transitioned 3 times, while test bar ran twice and never transitioned (it was always failing).
