I'm posting to gather insights about ELSER's language capabilities, particularly its support for languages other than English. My interest lies in understanding how well ELSER handles the following languages:
Dutch (NL)
French (FR)
German (DE)
If ELSER does not currently support these languages, I'd be curious to learn about any future plans. Specifically, are there any timelines or development stages aimed at integrating Dutch, French, and German? A roadmap would be ideal.
Insights into the challenges and strategies for adapting ELSER to these languages would be highly appreciated.
On the other hand, if ELSER already boasts multilingual support, I'd love to know more about its performance. How does it fare in terms of accuracy and relevance when processing Dutch, French, and German compared to English? I'm keen on understanding any particular strengths or limitations ELSER might exhibit in these languages.
I second this and am interested in the performance for Italian (IT). As far as I've seen, ELSER is English-only and the E5 model should be used instead. However, I can only find ELSER in my trained models console.
Hello, I can see the E5 model (not the E3) when I check my local dev environment running Elastic 8.12. It seems only E5 is supported. You could try importing the E3 model with eland; that should work.
Yeah, sorry, I misspelled it; I meant the E5 model. I can only see two versions of ELSER and the language detection model. Maybe that's because I'm on Elastic 8.11? I'm running an Elastic Cloud hosted deployment.
The element type should be float rather than byte, as the model produces a float embedding. Set dims to the size of the text embedding; for multilingual-e5-small that is 384.
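To make that concrete, here is a minimal sketch of what such a mapping body could look like, built as a plain Python dict so it can be printed and pasted into a create-index request. The field names `text` and `text_embedding` and the helper `build_mapping` are hypothetical, chosen only for illustration; the parts taken from the advice above are `element_type: float` and `dims: 384` for multilingual-e5-small.

```python
import json

# Embedding size for multilingual-e5-small, as stated in the reply above.
E5_SMALL_DIMS = 384


def build_mapping(vector_field: str = "text_embedding", dims: int = E5_SMALL_DIMS) -> dict:
    """Return an index mapping body with a float dense_vector field.

    The field names are hypothetical; adjust them to your own index.
    """
    return {
        "mappings": {
            "properties": {
                # Source text that the embedding is computed from (assumed field).
                "text": {"type": "text"},
                vector_field: {
                    "type": "dense_vector",
                    # float, not byte: the model produces float embeddings.
                    "element_type": "float",
                    # Must match the size of the model's output vector.
                    "dims": dims,
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body to use when creating the index.
    print(json.dumps(build_mapping(), indent=2))
```

If you switch to a different embedding model, only `dims` should need to change, as long as that model also produces float vectors.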