I have a list of different indexes (see image). I would like to join 2 of them and then apply the Machine Learning algorithms.
I want to join: "humhub-2019.04" with "humhub-2019.05"
I thought of saving the search and then calling each of them and using "AND".
Is there another way to join 2 indexes? Do I have to use "json"?
Thanks for trying ML....
If you want to run ML and/or queries across multiple indexes like the ones you are displaying, you simply create an index pattern, see here:
In your case you would create an index pattern like humhub-*
Then, when you create an ML job, you use that index pattern.
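To get a feel for which indexes a wildcard pattern such as humhub-* would pick up, here is a minimal Python sketch. It uses the standard-library fnmatch module as a stand-in for the wildcard matching (an approximation only: fnmatch also understands ? and [], which I am not claiming index patterns do). The index list is hypothetical apart from the two humhub names mentioned in this thread.

```python
from fnmatch import fnmatchcase

def matching_indices(pattern, indices):
    """Return the index names that a simple '*' wildcard pattern would match."""
    return [name for name in indices if fnmatchcase(name, pattern)]

# Hypothetical index list; only the two humhub names come from this thread.
indices = ["humhub-2019.04", "humhub-2019.05", "other-2019.05"]

selected = matching_indices("humhub-*", indices)
print(selected)
```

Any index whose name starts with humhub- ends up in the result, which is why a broad pattern also pulls in months you may not want.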
Just to help me understand: the term Join is a SQL / RDBMS construct, typically with a
WHERE clause, that joins data from 2 disparate tables. SQL joins are not readily supported in Elasticsearch; data is typically denormalized in Elasticsearch. Is that what you want to do? If so, that is a much more complex question.
In the image I have just uploaded, it can be seen that I have 8 indices. But I would like to use only 2, humhub-2019.04 and humhub-2019.05.
Is there a way to filter that?
By the way, when applying Machine Learning, what is the split rate (train / test)?
First of all, Machine Learning in Elasticsearch today is specifically Unsupervised Time Series Anomaly Detection. A train / test split belongs to Supervised Machine Learning, which this is not.
Typically, for our Machine Learning to "learn" data that has periodicity, it takes at least 3 times the period (depending on many things).
So if you have a daily pattern it will take at least 3 days, a weekly pattern at least 3 weeks, etc.
And more data is generally better; less data will generally give worse results.
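As a quick illustration of that rule of thumb, here is a small Python sketch that multiplies a period by 3. The function name and the default of 3 are just for illustration; the result is a rough lower bound, not a guarantee.

```python
from datetime import timedelta

def minimum_history(period, multiples=3):
    """Rough lower bound on the data span needed to learn a periodic pattern."""
    return period * multiples

daily = minimum_history(timedelta(days=1))
weekly = minimum_history(timedelta(weeks=1))
print(daily.days, weekly.days)
```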
This might be a good webinar.
And here is a good overview
So, back to the first question: while setting up the ML job you could just limit the date range of the datafeed. That would probably be the easiest / best option.
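If you start the datafeed through the API rather than the UI, limiting the date range could look roughly like the sketch below. It only builds the request body with the start and end parameters of the start datafeed API; nothing is sent. The exact endpoint path depends on your version, and the April to May range is my assumption based on the two index names. As far as I know the end value is exclusive, so 1 June covers all of May.

```python
import json
from datetime import datetime, timezone

def datafeed_start_body(start, end):
    """Build a body for the start datafeed API using ISO 8601 timestamps."""
    return {"start": start.isoformat(), "end": end.isoformat()}

# Assumed range: all of April and May 2019 (end is treated as exclusive).
body = datafeed_start_body(
    datetime(2019, 4, 1, tzinfo=timezone.utc),
    datetime(2019, 6, 1, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```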
Or you can do something different: reindex those indexes into a single index, or reindex them into 2 indexes with different names and create an index pattern for just those 2.
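For the single-index option, a request body for the _reindex API could be sketched like this in Python. The destination name ml-humhub-2019.04-05 is hypothetical (pick whatever suits you), and the snippet only builds and prints the body; it does not call Elasticsearch.

```python
import json

def reindex_body(source_indices, dest_index):
    """Build a _reindex request body copying several source indexes into one."""
    return {
        "source": {"index": list(source_indices)},
        "dest": {"index": dest_index},
    }

# The destination index name is a made-up example.
body = reindex_body(["humhub-2019.04", "humhub-2019.05"], "ml-humhub-2019.04-05")
print(json.dumps(body, indent=2))
```

Choosing a destination name that does not start with humhub- keeps it from being picked up by a humhub-* index pattern.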
I want to limit the Date Range and Data Feed. But now I am having another problem (check image 1).
Is it related to the "Time Filter field name" in Image 2?
I have used the data despite the "mapping conflict", and it is working fine.