Elastic ML job errors

Hi!

I'm new here - working with my team to explore the ML offerings under a trial license, and I'm running into some errors and questions.

  • Can Anomaly Detection model output be viewed and used in Kibana visualizations? I see an index name listed in the job config, but I can't find that index anywhere.
  • I can't create Data Frame Analytics jobs because I receive an error like this:
    Cannot create data frame analytics [<job-id>] because user <user name> lacks permissions on the indices: {"<job-id>":{"read":true,"create_index":false,"index":false},"<index-pattern>":{"read":true}}
  • When viewing Anomaly Detection results, I seem to frequently receive the following error, but I do not know how to resolve it:
    You can't view anomaly charts for [anomaly job] because an error occurred while retrieving metric data.

Thanks,
Stefan

I can answer this one. When you create a data frame analytics job you need to specify a destination index, and you need to have permission to create that destination index. By default the UI wizard suggests naming the destination index with the same string you choose for the job ID, which is why the error refers to the index name being the job ID. But it doesn't have to be called that.

So, which index name patterns are you allowed to create indices under? Change the name of the destination index so that it is covered by one of those patterns. If you are not permitted to create indices at all, then you'll have to ask your administrator to let you create indices matching certain index name patterns, and then set the destination index for the job to one you're allowed to create.
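In case it helps, this is the sort of role your administrator could grant (a sketch only - the dfa_results_writer role name and the dfa-* pattern are placeholders, not anything built in). It covers the three index privileges the error message is checking for:

    PUT _security/role/dfa_results_writer
    {
      "indices": [
        {
          "names": [ "dfa-*" ],
          "privileges": [ "read", "create_index", "index" ]
        }
      ]
    }

With a role like that assigned (on top of your existing ML and source index permissions), setting the job's destination index to something matching dfa-* should get past the permission check.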

Thanks David,

Your answer did solve that issue. I thought we had already resolved it, but my new permissions didn't stick, which is why I was confused.

Any ideas about my other two issues related to the Anomaly Detection output, or is there anyone else who might know?

Thanks,
Stefan

There are some embeddable components that can be used to add certain parts of the ML UI into other Kibana dashboards. The pull requests linked from [ML] Embeddables enhancements · Issue #66553 · elastic/kibana · GitHub have the details.

It is also possible to search the results directly if you have the right index permissions. Results are stored either in .ml-anomalies-shared or in .ml-anomalies-custom-<name>, where <name> is the index_name setting of the job. Be aware, though, that granting a user privileges to search these indices directly bypasses the space-awareness of ML jobs within Kibana: any user who can search those indices directly can see results for jobs that are otherwise only visible in other Kibana spaces. So you probably wouldn't want to do this in a multi-user setup where differentiating who can see what is important.
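For example, a query along these lines (just a sketch - substitute your own job ID for my-anomaly-job) returns the bucket-level results for a single job, sorted by time:

    GET .ml-anomalies-shared/_search
    {
      "size": 100,
      "query": {
        "bool": {
          "filter": [
            { "term": { "job_id": "my-anomaly-job" } },
            { "term": { "result_type": "bucket" } }
          ]
        }
      },
      "sort": [ { "timestamp": "asc" } ]
    }

Several result types (bucket, record, influencer) from multiple jobs share these indices, so filtering on both job_id and result_type matters - otherwise you'll mix results from different jobs and of different kinds.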

Thanks again David, I was able to request the correct permissions so that I could access those indices. The "can't view anomaly charts" error is related to the embeddable components and shows up when they fail to load, but I couldn't figure out why. At least with direct access to the output I can display the anomaly detection results in a traditional visualization.

Following up after spending some time with the anomaly detection output. It is nice that I can access it, but its structure makes it less intuitive to use than I had hoped (I wanted to display the anomaly score as a time series, or a table listing aggregated anomaly scores for various entities).
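For reference, this is roughly what I've been experimenting with to get the overall anomaly score as a time series (a sketch of my own attempt, not a recommended approach - the job ID is a placeholder and I'm aggregating the bucket results' anomaly_score field):

    GET .ml-anomalies-shared/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            { "term": { "job_id": "my-anomaly-job" } },
            { "term": { "result_type": "bucket" } }
          ]
        }
      },
      "aggs": {
        "score_over_time": {
          "date_histogram": { "field": "timestamp", "fixed_interval": "1h" },
          "aggs": {
            "max_anomaly_score": { "max": { "field": "anomaly_score" } }
          }
        }
      }
    }

For the per-entity table I've been trying something similar against the record results (result_type record), with a terms aggregation on the entity field and a max of record_score.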

Ultimately I wish I could more reliably display the anomaly charts, but I'm guessing that the cardinality of my data is too great, which is what is causing the third error I listed above (You can't view anomaly charts for [anomaly job] because an error occurred while retrieving metric data.).

Hi @stefan.elser,

Regarding the error message you get - the UI currently provides a generic message, which is indeed not very helpful for troubleshooting. I've created an issue so you can track the progress, but for now I'd recommend checking the browser's Network tab to see the actual error response.

Hi @darnautov,

Thank you! That was helpful and gives me at least some idea of what happened. It seems the most common issue is an upstream request timeout. Is there a way I can improve this, or ask the system to wait longer before timing out? I also saw a lot of upstream request timeouts when creating ML jobs; is it possible that we didn't give the stack enough ML resources?

There might be several reasons for the upstream request timeout. It could be a Kibana, an Elasticsearch, or a proxy server timeout (the latter if you're using a cloud deployment). I'd recommend increasing elasticsearch.requestTimeout in the Kibana settings (in kibana.yml) as a first step.
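For example, in kibana.yml (the value is in milliseconds; 60000 here is just a starting point to try, not a recommended value):

    # kibana.yml
    # Time to wait for responses from Elasticsearch, in milliseconds (default is 30000)
    elasticsearch.requestTimeout: 60000

Kibana needs to be restarted for the change to take effect.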

@darnautov My engineer tried setting that value to 60s (it was previously using the default of 30s), but that didn't seem to change anything, and the timeout occurs well before 30s.

Do you have any other ideas? We aren't using a cloud deployment, but I am accessing the ELK stack via SSH to an air-gapped box.
