Supervised machine learning plugins or tools for Elasticsearch?

Similar to the Carrot2 plugin, are there any plugins or tools on top of ELK that can be used to train and execute supervised learning algorithms (preferably in Java)?

I want to classify search results based on training data.

The significant_terms aggregation can be used to extract features given a set of representative data - see https://www.elastic.co/blog/significant-terms-aggregation#classifier
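As a rough illustration of that idea, here is a minimal sketch of a search request body that asks for the terms that are unusually frequent in one labelled class. The field names `category` and `body` and the label `fruit` are made up for the example, and on an analyzed text field you may need fielddata or the `significant_text` aggregation instead, depending on your version.

```python
import json


def build_feature_query(label_field, label_value, text_field, size=10):
    """Build a _search body that extracts candidate features for one class.

    The query selects the representative (labelled) documents; the
    significant_terms aggregation then compares term frequencies in that
    set against the whole index.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"term": {label_field: label_value}},
        "aggs": {
            "features": {
                "significant_terms": {"field": text_field, "size": size}
            }
        },
    }


# Hypothetical field names and label, for illustration only.
feature_body = build_feature_query("category", "fruit", "body")
print(json.dumps(feature_body, indent=2))
```

The printed JSON can be sent to the index's `_search` endpoint with any client; the buckets under `features` are the candidate feature terms for that class.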

Can you elaborate a bit on your use case? Why do you need to classify search results, would it also be ok to classify documents prior to ingestion?

What kind of classification algorithm would you like to apply to your data?

I'm not aware of any officially supported plugins. A quick search turned up the following results:

Hope this helps, and looking forward to hearing more about your use case,
Isabel

Thanks for the quick reply. I am going through your nicely written post. Just to clarify, I don't want to do semantic analysis. In other words, I will train the system to correlate apples with zebras: if I query for 'apples', Elasticsearch should return 'zebras' based on my training and statistics. Can you suggest something?

Thanks for the prompt reply. It would not be OK to classify prior to ingestion, because the results may evolve while using machine learning based on history.

Any classification algorithm (a decision tree, for example). Essentially, I am interested in seeing supervised ML support in Elasticsearch.

Thank you for the links; I came across them too!

What do you mean by "the results may evolve while using machine learning based on history"?

Do you mean the results may change as soon as you re-train your classification model? How often would you re-train your model then?

This sounds like a typical use case for synonyms to me.

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html

should have more details on this.
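To make that a bit more concrete, here is a minimal sketch that turns trained term pairs into index settings using the synonym token filter. The names `trained_synonyms` and `trained_search` and the apples/zebras pair are placeholders taken from your example, not anything Elasticsearch defines.

```python
import json


def build_synonym_settings(pairs):
    """Build index settings with a synonym filter from (query_term, target) pairs.

    Each rule keeps the original term and adds the trained target, so a
    search for "apples" also matches documents containing "zebras".
    """
    rules = [f"{source} => {source}, {target}" for source, target in pairs]
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Placeholder filter name.
                    "trained_synonyms": {"type": "synonym", "synonyms": rules}
                },
                "analyzer": {
                    # Placeholder analyzer name, intended for search time.
                    "trained_search": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "trained_synonyms"],
                    }
                },
            }
        }
    }


synonym_settings = build_synonym_settings([("apples", "zebras")])
print(json.dumps(synonym_settings, indent=2))
```

Re-training would then mean regenerating the rule list and updating the filter, so how often you re-train matters for whether this approach is practical.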

Hope this helps,
Isabel

Depending on your user volumes, it might be worth considering on-the-fly analysis rather than pre-computing a limited set of trained responses.
If the query is something you might not have predicted, e.g. not just "apples" but "apples new AR headset", you can run significant terms on the best-matching docs and discover the term "iGizmo" (or whatever the product name might be). Doing the analysis on the fly tailors suggestions to the long tail of many and varied queries users provide, rather than just the "head of the tail" ones you were able to predict in your training data.
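A minimal sketch of what that on-the-fly request could look like, assuming a hypothetical text field called `body`: the `sampler` aggregation restricts the analysis to the top-scoring documents on each shard, and `significant_terms` runs only on that sample. The sample size of 200 is an arbitrary starting point, not a recommendation.

```python
import json


def build_on_the_fly_query(user_query, text_field, sample_size=200, size=5):
    """Build a _search body that suggests related terms for an arbitrary query.

    The match query scores documents against the user's input; the sampler
    keeps only the best matches per shard, and significant_terms finds the
    terms that stand out within that sample.
    """
    return {
        "size": 0,
        "query": {"match": {text_field: user_query}},
        "aggs": {
            "best_matches": {
                "sampler": {"shard_size": sample_size},
                "aggs": {
                    "suggestions": {
                        "significant_terms": {"field": text_field, "size": size}
                    }
                },
            }
        },
    }


# "body" is a hypothetical field name used only for illustration.
suggest_body = build_on_the_fly_query("apples new AR headset", "body")
print(json.dumps(suggest_body, indent=2))
```

Because nothing is pre-computed, the cost is paid per query, which is why user volume is the thing to check first.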

Thanks, Mark. I will think along those lines as well.

Synonyms might work; I will think along those lines. Thank you for your time!