I have a use case where I expect certain files to be received within a particular week of the month. As soon as a file is received, an entry is made in a MySQL DB, and I read that using Logstash. I have also created an index for this data in Kibana.
This process has been running for a few months now.
I am trying to use the ML feature in Kibana to estimate which files should be received next week, using the forecasting feature. Can this be done?
X-Pack ML can predict occurrences (counts of things), based upon established learned behavior (i.e. normally there are 2 items during this time period, 3 during the next time period, and so on).
However, you seem to be describing a situation that may:
not actually have enough historical data to learn from. You say "expect certain files to be received within a particular week of the month" and "This process has been running for a few months now", which tells me that there's only a handful of observations available (fewer than 8-12, because there aren't that many weeks in a few months). ML, in general, needs a few hundred observations of a time series before it can compute a reliable model of behavior.
not make clear what you want a prediction of. Presumably it is some attribute of the files (e.g. the number received). Can you clarify which feature of the file(s) you want to forecast?
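To make the first point concrete, here is a rough sketch of how many buckets (observations) a given stretch of history yields at different bucket sizes. The dates are hypothetical stand-ins for "a few months", and `bucket_count` is an illustrative helper, not part of any Elastic API.

```python
from datetime import datetime, timedelta


def bucket_count(start, end, bucket_span):
    """Return the number of whole buckets of length bucket_span in [start, end)."""
    return (end - start) // bucket_span


# Hypothetical history: roughly three months of data.
start = datetime(2018, 1, 1)
end = datetime(2018, 4, 1)

spans = {
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "1w": timedelta(weeks=1),
}

counts = {name: bucket_count(start, end, span) for name, span in spans.items()}

for name, n in counts.items():
    print(f"bucket_span={name}: {n} buckets")
```

With weekly buckets this history gives only about a dozen observations. Smaller buckets give more data points, but if files only arrive during one week of the month, many of those buckets will simply be empty.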
If it is the number of files received that you want to forecast, you could:
Create a Single Metric ML job, using the count function and a bucket_span somewhere between 1h and 1d (you might try two different jobs and see which one works better for your data).
Run the ML job over the entire span of the historical data that you have.
Open the results of that job in the Single Metric Viewer.
Click the "Forecast" button and enter the number of days into the future that you want the forecast to cover.
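If you would rather script this than click through Kibana, the same job and forecast can be expressed as request bodies for the ML anomaly detection job and forecast APIs. The sketch below only builds and prints those bodies; it does not contact a cluster. The job ID and the time field name `received_at` are assumptions for illustration, the endpoint prefix depends on your version (older X-Pack releases use `_xpack/ml/...` instead of `_ml/...`), and the datafeed that the Kibana wizard creates for you is left out.

```python
import json

# Hypothetical names: replace with your own job ID and timestamp field.
JOB_ID = "file-arrival-count"
TIME_FIELD = "received_at"


def build_job_config(bucket_span, time_field):
    """Body for creating an anomaly detection job with a single count detector."""
    return {
        "description": "Count of files received per bucket",
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [{"function": "count"}],
        },
        "data_description": {"time_field": time_field},
    }


def build_forecast_request(days):
    """Body for the forecast API: how far into the future to predict."""
    return {"duration": f"{days}d"}


job_config = build_job_config("1d", TIME_FIELD)
forecast_request = build_forecast_request(7)

print(f"PUT _ml/anomaly_detectors/{JOB_ID}")
print(json.dumps(job_config, indent=2))
print(f"POST _ml/anomaly_detectors/{JOB_ID}/_forecast")
print(json.dumps(forecast_request, indent=2))
```

Building one config with `"1h"` and another with `"1d"` is an easy way to try the two bucket spans side by side, as suggested above.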