Create machine learning job for trend analysis

I have a lot of data in Elasticsearch recording the tickets sold for men, women and children every day, like the following:

{
  "_index": "logstash-people-2020.01.22",
  "_type": "_doc",
  "_id": "I7xKznAB6obTCirlmSMC",
  "_version": 1,
  "_score": null,
  "_source": {
    "men": 111,
    "@timestamp": "2020-01-22T12:00:00.000Z",
    "path": "/tmp/people.json",
    "@version": "1",
    "tags": [
      "multiline"
    ],
    "country": "US",
    "children": 127,
    "host": "ubuntu",
    "message": "{ \"date\": \"2020-01-22 13:00:00\",\n\"country\": \"US\",\n\"men\": 111,\n\"women\": 26,\n\"childrens\": 127\n}",
    "women": 26,
    "date": "2020-01-22 13:00:00"
  },
  "fields": {
    "@timestamp": [
      "2020-01-22T12:00:00.000Z"
    ]
  },
  "sort": [
    1579694400000
  ]
}

I would like to use machine learning to create a sales forecast for the next week from the dataset available in Elasticsearch. How can I do this?

Yes, you can try using Machine Learning: create a job and detect anomalies.

If you have a basic license, you can use the Data Visualizer to learn more about your data. In particular, if your data is stored in Elasticsearch and contains a time field, you can use the Data Visualizer to identify possible fields for anomaly detection.
You can also upload a CSV, NDJSON, or log file (up to 100 MB in size). The Data Visualizer identifies the file format and field mappings. You can then optionally import that data into an Elasticsearch index.

You need the following permissions to use the Data Visualizer with file upload:

  • cluster privileges: monitor, manage_ingest_pipelines
  • index privileges: read, manage, index

More information can be found here: https://www.elastic.co/guide/en/kibana/current/xpack-ml.html
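As a rough illustration of the privileges listed above, here is a minimal Python sketch that builds a request body for the Elasticsearch create-role API. The role name `file_upload_user` and the index name `uploaded-tickets` are made up for the example, and the script only prints the request rather than sending it to a cluster.

```python
import json

# Privileges taken from the list above.
CLUSTER_PRIVILEGES = ["monitor", "manage_ingest_pipelines"]
INDEX_PRIVILEGES = ["read", "manage", "index"]


def build_role_body(index_names):
    """Build a create-role request body granting the file upload privileges."""
    return {
        "cluster": list(CLUSTER_PRIVILEGES),
        "indices": [
            {"names": list(index_names), "privileges": list(INDEX_PRIVILEGES)}
        ],
    }


# Hypothetical names, chosen only for this example.
ROLE_NAME = "file_upload_user"
role_body = build_role_body(["uploaded-tickets"])

print(f"PUT _security/role/{ROLE_NAME}")
print(json.dumps(role_body, indent=2))
```

The printed request can be pasted into the Kibana Dev Tools console after replacing the index name with the one you plan to import into.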

The Elastic machine learning anomaly detection feature automatically models the normal behavior of your time series data — learning trends, periodicity, and more — in real time to identify anomalies, streamline root cause analysis, and reduce false positives.

cc @Peter_Harverson, who can please shed more light here.

Thanks
Rashmi

A good place to start would be to create some single metric and multi metric anomaly detection jobs using the ML UI in Kibana - see this page in the docs for an overview. For example, creating three single metric jobs to model the ticket sales for men, women and children, using sum(men), sum(women) and sum(children). Or you could combine these into a single job using the multi metric wizard, with three detectors sum(men), sum(women) and sum(children). A bucket span of 1 day would seem suitable, although this would depend on how far back in time your data set goes.
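If you prefer the API to the wizard, the multi metric job described above could look roughly like the following sketch. It builds the request bodies for the create-job and create-datafeed APIs with three `sum` detectors and a 1 day bucket span. The job ID `ticket-sales` is invented for the example, and the index pattern `logstash-people-*` is guessed from the index name in the sample document. Nothing is sent to a cluster; the script only prints the requests.

```python
import json


def build_job_body(fields, bucket_span="1d", time_field="@timestamp"):
    """Build an anomaly detection job body with one sum detector per field."""
    detectors = [{"function": "sum", "field_name": name} for name in fields]
    return {
        "description": "Daily ticket sales for men, women and children",
        "analysis_config": {"bucket_span": bucket_span, "detectors": detectors},
        "data_description": {"time_field": time_field},
    }


def build_datafeed_body(job_id, index_pattern):
    """Build a datafeed body that feeds the job from the given indices."""
    return {
        "job_id": job_id,
        "indices": [index_pattern],
        "query": {"match_all": {}},
    }


# Hypothetical job ID; index pattern assumed from the sample document.
JOB_ID = "ticket-sales"
job_body = build_job_body(["men", "women", "children"])
datafeed_body = build_datafeed_body(JOB_ID, "logstash-people-*")

print(f"PUT _ml/anomaly_detectors/{JOB_ID}")
print(json.dumps(job_body, indent=2))
print(f"PUT _ml/datafeeds/datafeed-{JOB_ID}")
print(json.dumps(datafeed_body, indent=2))
```

After creating both, the job has to be opened and the datafeed started before it begins modelling the data.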

Having run these jobs to learn the trends in your data, you can then run a forecast for the following week from the Single Metric Viewer. This blog discusses the forecasting functionality in more detail.
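The same forecast can also be requested through the forecast API instead of the UI. The sketch below only assembles the request for a 7 day forecast. The job ID `ticket-sales-men` is a placeholder for whichever job you created, and the job needs to be open when the request is sent.

```python
import json


def build_forecast_request(job_id, days=7):
    """Return (method, path, body) for a forecast covering the next `days` days."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    path = f"_ml/anomaly_detectors/{job_id}/_forecast"
    body = {"duration": f"{days}d"}
    return "POST", path, body


# Hypothetical job ID, used only for illustration.
method, path, body = build_forecast_request("ticket-sales-men", days=7)

print(f"{method} {path}")
print(json.dumps(body, indent=2))
```

The forecast results are stored with the job results and can be viewed in Kibana alongside the model plot for the job.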

I see there is also a country field in your data, so you may want to explore the effect of this variable, as well as the time factor, if your sales data is split across different countries. For this, you could try using a regression analysis to investigate the relationship between different fields in your sales data.
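One possible reading of this suggestion is a data frame analytics job of type regression, assuming your stack version includes that feature. The sketch below builds a request body that predicts `men` from `women`, `children` and `country`. The choice of dependent variable, the job ID `ticket-sales-regression` and the destination index `ticket-sales-regression-results` are all invented for the example, and the source index pattern is guessed from the sample document.

```python
import json


def build_regression_body(source_index, dest_index, dependent_variable,
                          included_fields, training_percent=80):
    """Build a data frame analytics body for a regression analysis."""
    if dependent_variable not in included_fields:
        raise ValueError("dependent_variable must be one of included_fields")
    return {
        "source": {"index": [source_index]},
        "dest": {"index": dest_index},
        "analysis": {
            "regression": {
                "dependent_variable": dependent_variable,
                "training_percent": training_percent,
            }
        },
        "analyzed_fields": {"includes": list(included_fields)},
    }


# Hypothetical IDs and index names; field names come from the sample document.
ANALYTICS_ID = "ticket-sales-regression"
regression_body = build_regression_body(
    source_index="logstash-people-*",
    dest_index="ticket-sales-regression-results",
    dependent_variable="men",
    included_fields=["men", "women", "children", "country"],
)

print(f"PUT _ml/data_frame/analytics/{ANALYTICS_ID}")
print(json.dumps(regression_body, indent=2))
```

Swapping the dependent variable lets you check each ticket category in turn against the other fields.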

Hope that helps.
Pete
