Is there any way to extract the results of an ML job (i.e., the detected anomalies in the input data) using a language like Python with its Elasticsearch client?
All results from ML jobs are stored in the .ml-anomalies-* indices and are therefore accessible via query from any ES client.
I used the following code to fetch the results for the anomaly jobs:
doc = {
    'size': 10000,
    'query': {
        'match_all': {}
    }
}
res = es.search(index='.ml-anomalies-*', body=doc)
I got the following output:
{'_index': '.ml-anomalies-shared',
 '_type': '_doc',
 '_id': 'abcd_bucket_1560420000000_3600',
 '_score': 1.0,
 '_source': {'job_id': 'abcd',
  'timestamp': 1560420000000,
  'anomaly_score': 0.0,
  'bucket_span': 3600,
  'initial_anomaly_score': 0.0,
  'event_count': 68,
  'is_interim': True,
  'bucket_influencers': ,
  'processing_time_ms': 2,
  'result_type': 'bucket'}
When I tried to convert the timestamp to a datetime, I got the following error:
----> 1 dt=datetime.fromtimestamp(1560420000000)
ValueError: year 51417 is out of range
In what format does the .ml-anomalies-* index store the timestamp value? Is any formatting needed for the timestamp values returned to get the correct date and time?
Does this help?
Internally, dates are converted to UTC (if the time-zone is specified) and stored as a long number representing milliseconds-since-the-epoch.
You should be able to convert from that.
Thank you, I have resolved the issue now. I just divided the timestamp by 1000, as it was in milliseconds.
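For anyone landing here later, this is roughly what the conversion looks like. It is only a minimal sketch: the helper name ms_to_datetime is made up for illustration, and it assumes you want a timezone-aware UTC datetime rather than local time.

```python
from datetime import datetime, timezone


def ms_to_datetime(timestamp_ms):
    """Convert milliseconds-since-the-epoch to a timezone-aware UTC datetime.

    Hypothetical helper name; datetime.fromtimestamp expects seconds,
    so the millisecond value is divided by 1000 first.
    """
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


# The timestamp from the bucket result shown earlier in the thread.
dt = ms_to_datetime(1560420000000)
print(dt.isoformat())  # 2019-06-13T10:00:00+00:00
```

Passing tz=timezone.utc keeps the result independent of the machine's local time zone, which matches how the dates are stored.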
Note that the results index contains several different kinds of documents, each useful in its own way:
result_type:bucket
result_type:record
result_type:influencer
More info here: https://www.elastic.co/blog/machine-learning-anomaly-scoring-elasticsearch-how-it-works
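If you only want one kind of result document, you can filter on result_type instead of using match_all. The following is a sketch under some assumptions, not an official recipe: build_results_query is a hypothetical helper, the score field names for records and influencers (record_score, influencer_score) are my understanding of the results schema and worth checking against your own documents, and the size and min_score defaults are arbitrary.

```python
def build_results_query(job_id, result_type, min_score=0.0, size=1000):
    """Build a search body for one result type of one ML job.

    Hypothetical helper. The score field differs per result type;
    anomaly_score appears in the bucket document above, the other two
    names are assumptions to verify against your index.
    """
    score_fields = {
        "bucket": "anomaly_score",
        "record": "record_score",
        "influencer": "influencer_score",
    }
    score_field = score_fields[result_type]  # KeyError on an unknown type
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"job_id": job_id}},
                    {"term": {"result_type": result_type}},
                    {"range": {score_field: {"gte": min_score}}},
                ]
            }
        },
        # Oldest results first.
        "sort": [{"timestamp": {"order": "asc"}}],
    }


query = build_results_query("abcd", "record", min_score=75)
# With an existing client instance named es, as in the original post:
# res = es.search(index=".ml-anomalies-*", body=query)
```

Using filter clauses rather than match_all means Elasticsearch skips scoring and returns only the documents you actually need, which matters once a job has produced many buckets.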