Extract the results of an ML job

Is there any way to extract the results of an ML job (i.e., the anomalies detected in the input data) using a language like Python with its Elasticsearch client?

All results from ML jobs are stored in the .ml-anomalies-* indices and are therefore accessible by query from any ES client.

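I used the following code to fetch the results for the anomaly jobs:
# `es` is an already-connected Elasticsearch client instance
doc = {
    'size': 10000,        # maximum number of hits to return
    'query': {
        'match_all': {}   # no filtering: every result document
    }
}
res = es.search(index='.ml-anomalies-*', body=doc)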

I got the following output:

{'_index': '.ml-anomalies-shared',
'_type': '_doc',
'_id': 'abcd_bucket_1560420000000_3600',
'_score': 1.0,
'_source': {'job_id': 'abcd',
'timestamp': 1560420000000,
'anomaly_score': 0.0,
'bucket_span': 3600,
'initial_anomaly_score': 0.0,
'event_count': 68,
'is_interim': True,
'bucket_influencers': ,
'processing_time_ms': 2,
'result_type': 'bucket'}

When I tried to convert the timestamp to a datetime, I got the error below:
----> 1 dt=datetime.fromtimestamp(1560420000000)

ValueError: year 51417 is out of range

In what format does the .ml-anomalies-* index store the timestamp value? Is any formatting needed for the timestamp values returned to get the correct date and time?

Does this help?

Internally, dates are converted to UTC (if the time-zone is specified) and stored as a long number representing milliseconds-since-the-epoch.


You should be able to convert from that.

Thank you, I have resolved the issue now. I just divided the timestamp by 1000, because it was in milliseconds.
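
For anyone landing here later, the conversion might look like the sketch below. It assumes the value is milliseconds since the epoch in UTC, as the quoted documentation describes; the helper name `ms_to_datetime` is only for illustration.

```python
from datetime import datetime, timezone

def ms_to_datetime(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # fromtimestamp() expects seconds, so scale down from milliseconds.
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

dt = ms_to_datetime(1560420000000)
print(dt.isoformat())  # 2019-06-13T10:00:00+00:00
```

Passing `tz=timezone.utc` keeps the result independent of the machine's local time zone; drop it if a naive local datetime is what you want.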

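Note that the results index contains several kinds of documents, distinguished by the result_type field (the document shown above is a bucket result; there are also record and influencer results, among others), and each serves a different purpose.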


More info here: https://www.elastic.co/blog/machine-learning-anomaly-scoring-elasticsearch-how-it-works
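
Because the results index mixes several kinds of documents, a match_all query returns all of them together. A query can instead be narrowed to one job and one kind of result. The sketch below is one possible way to do that, assuming job_id and result_type can be matched exactly with term filters and using the field names from the output earlier in this thread. The function name `build_bucket_query` and the score threshold are made up for illustration.

```python
def build_bucket_query(job_id, min_score=0.0, size=1000):
    """Build a search body for the bucket results of a single job."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"job_id": job_id}},
                    {"term": {"result_type": "bucket"}},
                    # Keep only buckets at or above the chosen score.
                    {"range": {"anomaly_score": {"gte": min_score}}},
                ]
            }
        },
        # Oldest bucket first.
        "sort": [{"timestamp": {"order": "asc"}}],
    }

body = build_bucket_query("abcd", min_score=75)
# With a connected client, as in the earlier post:
# res = es.search(index='.ml-anomalies-*', body=body)
```

Other result types carry their own score fields, so the range filter would need adjusting for them.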