Auto detect language

(pran) #1

I am crawling content and using the Tika parser to extract the metadata and body text of each document (doc, pdf, excel, etc.). I need to send the result to Elasticsearch for indexing. I have a field called 'language' where I need to specify the language of the content. The language should be detected from two fields, 'content' and 'title'. Is there any way I can do this?

(Jörg Prante) #2

You can try my plugin
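
If detecting on the client side before indexing is preferred, the general shape could look like the sketch below. It is only an illustration under stated assumptions: the class `LanguageTagger`, the method names, and the tiny stopword lists are all hypothetical, and the stopword counter is a toy stand-in for a real detector (for example one from Tika's language detection module), not anything Tika or Elasticsearch provides. The point is just that 'title' and 'content' are joined, passed through a detector function, and the result is stored in the 'language' field of the document that gets indexed.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

public class LanguageTagger {

    // Hypothetical toy data: a few stopwords per language code.
    // A real detector would replace this whole lookup.
    private static final Map<String, Set<String>> STOPWORDS = new TreeMap<>(Map.of(
            "en", Set.of("the", "and", "is", "of", "to", "in"),
            "de", Set.of("der", "die", "das", "und", "ist", "nicht"),
            "fr", Set.of("le", "la", "les", "et", "est", "des")));

    // Toy detector: returns the language whose stopwords occur most often,
    // or "unknown" when no stopword matches at all.
    static String detectByStopwords(String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Builds the document to index: detection runs on title and content together,
    // and the result goes into the 'language' field.
    static Map<String, Object> buildDocument(String title, String content,
                                             Function<String, String> detector) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", title);
        doc.put("content", content);
        doc.put("language", detector.apply(title + "\n" + content));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = buildDocument(
                "The title of the report",
                "This is the body and it is in English",
                LanguageTagger::detectByStopwords);
        // The resulting map would then be serialized to JSON and sent to the index.
        System.out.println(doc);
    }
}
```

Passing the detector as a `Function<String, String>` keeps the document-building step independent of whichever detection library ends up being used.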
