Fields and Row template in Annotations [TSVB]

Hi.
I am using TSVB to visualize the data stored in the machine learning index. I have seen that I can use annotations to show the details of an anomaly, but when I try to add the causes.typical field, no annotations are generated in the visualization.
Data:

"_source": {
    "job_id": "customers_fraud",
    "result_type": "record",
    "probability": 0.0017884393731519299,
    "record_score": 0.4412245161339784,
    "initial_record_score": 0.4412245161339784,
    "bucket_span": 3600,
    "detector_index": 0,
    "is_interim": true,
    "timestamp": 1620396000000,
    "function": "count",
    "function_description": "count",
    "over_field_name": "customer_id.keyword",
    "over_field_value": "123456789",
    "causes": [
      {
        "probability": 0.0017884393731519297,
        "function": "count",
        "function_description": "count",
        "typical": [
          4.239593819083309
        ],
        "actual": [
          19
        ],
        "over_field_name": "customer_id.keyword",
        "over_field_value": "123456789"
      }
    ],
    "influencers": [
      {
        "influencer_field_name": "customer_id.keyword",
        "influencer_field_values": [
          "123456789"
        ]
      },
      {
        "influencer_field_name": "payment_method.keyword",
        "influencer_field_values": [
          "CREDIT"
        ]
      },
      {
        "influencer_field_name": "currency.keyword",
        "influencer_field_values": [
          "USD"
        ]
      }
    ],
    "customer_id.keyword": [
      "123456789"
    ],
    "currency.keyword": [
      "USD"
    ],
    "payment_method.keyword": [
      "CREDIT"
    ]
  },
  "fields": {
    "timestamp": [
      "2021-05-07T14:00:00.000Z"
    ]
  },
  "sort": [
    1620396000000
  ]
}

Hi Oscar,

This is an array of causes, so I think that causes.typical won't address any field in it. It would have to be something like causes[0].typical[0]. You can try with and without the [0] for typical. I doubt that this is a supported scenario, so if you come back and say that it doesn't work for you, I'll open an issue for it in the Kibana repo.

I have reproduced the situation for Oscar and tried all kinds of combinations: causes.actual, causes.actual._value, causes.0.actual, and now causes[0].typical[0], but none seem to work.

I'm wondering if this is related to this bug/enhancement: TSVB Make mustache template field accessors consistent · Issue #59435 · elastic/kibana · GitHub

Yeah, I think we need to create a specific issue for it then, which can be linked to a meta issue for more improvements.

I think the issue is primarily because causes is a nested object....

@Oskr - A possible workaround could be the usage of Transforms. In particular, you could use Transforms to re-format the .ml-anomalies-* index into a new, very small index for reporting purposes for TSVB. For example:

PUT _transform/my_ml_annotations
{
  "source": {
    "index": [
      ".ml-anomalies-*"
    ],
    "query": {
      "bool": {
        "filter": [
          {
            "term": {
              "result_type": "record"
            }
          },
          {
            "term": {
              "job_id": "url_scanning"
            }
          },
          {
            "range": {
              "record_score": {
                "gte": "99"
              }
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "my_ml_annotations"
  },
  "pivot": {
    "group_by": {
      "timestamp": {
        "date_histogram": {
          "field": "timestamp",
          "fixed_interval": "15m"
        }
      },
      "clientip": {
        "terms": {
          "field": "clientip"
        }
      }
    },
    "aggregations": {
      "record_score": {
        "max": {
          "field": "record_score"
        }
      },
      "typical": {
        "scripted_metric": {
          "init_script": "state.typical = null",
          "map_script": "state.typical = params._source.causes.0.typical.0",
          "combine_script": "return state.typical",
          "reduce_script": "for (d in states) if (d != null) return d"
        }
      },
      "actual": {
        "scripted_metric": {
          "init_script": "state.actual = null",
          "map_script": "state.actual = params._source.causes.0.actual.0",
          "combine_script": "return state.actual",
          "reduce_script": "for (d in states) if (d != null) return d"
        }
      }
    }
  }
}

The above will create a new, "flattened" index called my_ml_annotations, with one document per timestamp and clientip that holds the record_score, typical, and actual values.

I can then use that index in TSVB.

Of course, you'd need to run the transform "continuously" by defining the frequency and sync:
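As a rough illustration of what "defining the frequency and sync" means, here is a minimal Python sketch that adds those two settings to a transform body before it is sent to PUT _transform/my_ml_annotations. The helper name make_continuous and the 5m / 120s values are assumptions for illustration only; pick a delay that covers how late your anomaly records are written.

```python
import copy
import json


def make_continuous(transform_body, sync_field="timestamp", delay="120s", frequency="5m"):
    """Return a copy of a transform body with continuous-mode settings added.

    sync_field, delay and frequency are illustrative defaults (assumptions),
    not values taken from the original post.
    """
    body = copy.deepcopy(transform_body)
    # How often the transform checks the source index for changes.
    body["frequency"] = frequency
    # Which date field is used to detect new documents, and how long to wait
    # before treating a time range as complete.
    body["sync"] = {"time": {"field": sync_field, "delay": delay}}
    return body


# Trimmed-down stand-in for the transform body shown above.
base = {
    "source": {"index": [".ml-anomalies-*"]},
    "dest": {"index": "my_ml_annotations"},
}

continuous = make_continuous(base)
print(json.dumps(continuous, indent=2))
```

The printed JSON can be merged into the request body above; the transform then needs to be started before it begins updating the destination index.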

This is possible, but the correct syntax is indeed very hard to hit:

The fields list only has to mention causes, then in the row template you can get it using this syntax: {{causes.[0].actual.[0]}}

It's weird, I know, but it correctly picks up the value 19 for the tooltip.

Hi @flash1293, thanks for your help. Can you confirm the version of Kibana you are working on?
I tried to build the visualization the way you describe and got the same problem.

Thanks @richcollier, I think it is a good solution, although I don't know whether it will put extra load on the cluster in the future. I will propose it to our team and test it.

@Oskr - I'm not sure @flash1293's solution works specifically with the .ml-anomalies-* index because of the way that it is mapped, but we can wait for his clarification (I couldn't get his suggestion to work either). I suspect his test didn't actually use the true .ml-anomalies-* index, but rather a mock-up.

My workaround using Transforms will be incredibly lightweight. Transforms just use Elasticsearch aggregations under the hood, and the .ml-anomalies-* index that they operate on shouldn't be that big in the first place!

Thanks @richcollier I tried to solve the problem with your tips but they didn't work.

Hi @Marius_Dragomir, yes, the problem occurs when trying to generate annotations from arrays.
I tried to solve the problem with your tips but they didn't work.

I'm no ML expert, maybe something special is going on there. TSVB is simply reading the _source from the document, so if the source is available, it should work. Maybe ML anomalies are not storing this part of the source?

The part I'm sure about is that the mustache syntax for accessing the first value of an array is path.[0] (not path.0 or path[0], which would make more sense).
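To make the path.[0] notation concrete, here is a small Python sketch of how such a path walks the _source from the original post. It is only an illustration of the lookup, assuming each dot-separated segment is either a key or a bracketed index; the resolve function is hypothetical and is not the template engine TSVB uses.

```python
import re


def resolve(path, source):
    """Walk a dot-separated path where array indexes are written as [N].

    Illustrative only: keys that themselves contain dots
    (e.g. "customer_id.keyword") are not handled.
    """
    current = source
    for segment in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", segment)
        if index:
            # Bracketed segment: treat it as a list position.
            current = current[int(index.group(1))]
        else:
            # Plain segment: treat it as an object key.
            current = current[segment]
    return current


# Trimmed-down copy of the causes array from the document above.
source = {"causes": [{"typical": [4.239593819083309], "actual": [19]}]}

actual = resolve("causes.[0].actual.[0]", source)
typical = resolve("causes.[0].typical.[0]", source)
print(actual, typical)
```

Note that this only describes what happens once a document reaches the row template; it says nothing about whether TSVB finds the document in the first place.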

Yes, ML stores the causes array in _source with a mapping type of nested so there must be something else going on here.

It turns out this is exactly the situation. The causes array is stored in _source with a mapping type of nested, and that trips up TSVB, because TSVB executes an exists filter on the data:

        {
          "exists": {
            "field": "causes"
          }
        }

which returns nothing.
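One way to check this diagnosis yourself in Dev Tools is to compare the plain exists query with the same check wrapped in a nested query. The Python sketch below only builds the two request bodies for a _search against .ml-anomalies-*; the helper names are hypothetical, and the nested variant is my own suggestion for verification, not something TSVB sends.

```python
import json


def plain_exists(field):
    """Search body equivalent to the filter TSVB applies."""
    return {"query": {"exists": {"field": field}}}


def nested_exists(path, leaf):
    """Same check, wrapped in a nested query on the given path (assumption:
    the path is mapped as nested, as described for causes)."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"exists": {"field": f"{path}.{leaf}"}},
            }
        }
    }


tsvb_style = plain_exists("causes")
nested_style = nested_exists("causes", "actual")

print(json.dumps(tsvb_style, indent=2))
print(json.dumps(nested_style, indent=2))
```

If the first body returns no hits while the second one does, that would be consistent with the nested mapping being the reason the annotations never appear.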

For now, stick with the transforms workaround. It may also be possible to use a runtime field for items buried in the causes array, but I have yet to test that (and perhaps TSVB doesn't support runtime fields until 7.13).

Thank you @richcollier for the explanation.
