Similar to the carrot2 plugin, are there any plugins or tools on top of ELK that can be used to train and execute supervised learning algorithms (preferably Java)?
I want to classify search results based on training data.
The significant_terms aggregation can be used to extract features given a set of representative data - see https://www.elastic.co/blog/significant-terms-aggregation#classifier
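As a rough illustration of that idea, here is a minimal sketch of a search body that asks Elasticsearch for the terms that are unusually frequent in documents carrying one training label. The field names (`category`, `body`), the label value and the helper name `build_feature_query` are assumptions made up for this example; only the `significant_terms` aggregation itself comes from the post above, and the text field would need to be aggregatable in your mapping.

```python
import json


def build_feature_query(label_field, label, text_field, size=10):
    """Build a search body that returns terms over-represented in one class.

    All field names are hypothetical; adapt them to your own mapping.
    """
    return {
        "size": 0,  # skip the hits, only the aggregation result is needed
        "query": {"term": {label_field: label}},  # the labelled training docs
        "aggs": {
            "features": {
                "significant_terms": {"field": text_field, "size": size}
            }
        },
    }


body = build_feature_query("category", "spam", "body", size=20)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to the `_search` endpoint of the index that holds the labelled documents; the returned buckets would then serve as candidate features for that class.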
Can you elaborate a bit on your use case? Why do you need to classify search results? Would it also be OK to classify documents prior to ingestion?
What kind of classification algorithm would you like to apply to your data?
I'm not aware of any officially supported plugins. A quick search turned up the following results:
Hope this helps, and looking forward to hearing more on your use case,
Isabel
Thanks for the quick reply; I am going through your nicely written post. Just to clarify, I don't want to do semantic analysis. In other words, I will train the system to correlate apples with zebras: if I query for 'apples', Elasticsearch should return 'zebras' based on my training and statistics. Can you suggest an approach?
Thanks for the spontaneous reply. It would not be OK to classify prior to ingestion, because the results may evolve while using machine learning based on history.
Any classification algorithm (a decision tree, for example) would do; essentially, I am interested in seeing supervised ML support using Elasticsearch.
Thank you for the links; I came across them too!
What do you mean - the results may evolve while using machine learning based on history?
Do you mean the results may change as soon as you re-train your classification model? How often would you re-train your model then?
This sounds like a typical use case for synonyms to me.
should have more details on this.
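To make the synonym suggestion concrete, here is a minimal sketch that turns a learned mapping into index settings with a `synonym` token filter. The names `trained_synonyms`, `search_with_synonyms` and `build_synonym_settings` are hypothetical, and the apples/zebras mapping is simply the example from this thread; whether the mapped terms should replace or extend the original term is a choice you would make for your own data.

```python
import json


def build_synonym_settings(mappings):
    """Turn {source_term: [target_terms]} into index analysis settings.

    Filter and analyzer names are made up for this sketch.
    """
    # Explicit mappings use the "source => target1, target2" rule format.
    rules = [
        f"{source} => {', '.join(targets)}"
        for source, targets in sorted(mappings.items())
    ]
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "trained_synonyms": {"type": "synonym", "synonyms": rules}
                },
                "analyzer": {
                    "search_with_synonyms": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "trained_synonyms"],
                    }
                },
            }
        }
    }


# Keep 'apples' itself in the output so the original matches are not lost.
settings = build_synonym_settings({"apples": ["apples", "zebras"]})
print(json.dumps(settings, indent=2))
```

With settings along these lines, a query for 'apples' analyzed by `search_with_synonyms` would also look for 'zebras'. Re-training would then mean regenerating the rule list and updating the index settings.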
Hope this helps,
Isabel
Depending on your user volumes, it might be worth considering on-the-fly analysis rather than pre-computing a limited set of trained responses.
If the query was something you might not have predicted, e.g. not just 'apples' but 'apples new AR headset', you can run significant terms on the best-matching docs and discover the term 'iGizmo' (or whatever the product name might be). Doing the analysis on the fly tailors suggestions to the long tail of many and varied queries users provide rather than just the "head of the tail" ones you were able to predict in your training data.
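The on-the-fly approach described above can be sketched as a single search body: match the user's query, keep only the top-scoring documents with a `sampler` aggregation, and run `significant_terms` inside that sample. The field name `body`, the sample size and the helper name `build_related_terms_query` are assumptions for illustration, not something stated in the thread.

```python
import json


def build_related_terms_query(user_query, text_field, sample_size=200, size=5):
    """Build a search body that finds terms related to an arbitrary query.

    Field name and sizes are hypothetical defaults for this sketch.
    """
    return {
        "size": 0,  # only the aggregation result is of interest
        "query": {"match": {text_field: user_query}},
        "aggs": {
            "best_docs": {
                # Restrict the analysis to the top-scoring docs on each shard.
                "sampler": {"shard_size": sample_size},
                "aggs": {
                    "related": {
                        "significant_terms": {"field": text_field, "size": size}
                    }
                },
            }
        },
    }


body = build_related_terms_query("apples new AR headset", "body")
print(json.dumps(body, indent=2))
```

Because the sample is built from whatever the user typed, nothing has to be trained in advance; the trade-off is the extra aggregation cost on every query.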
Thanks Mark, will think along those lines as well.
Synonyms might work; will think along those lines. Thank you for your time!