Word count from documents

Here is what I have tried so far:

  1. I used Tika with the Python code I shared. It indexes the data as a '.keyword' field, but it doesn't show the count of individual words in a PDF file.

  2. I used FSCrawler. It indexes the data as 'content' and not as a '.keyword' field, so the field doesn't even show up in the visualization tab.

  3. I am still working on the ingest plugin. I have not yet found a way to index a PDF file with it, and I am running into a lot of issues. I will keep working on that.
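
To be clear about the result I am after, here is a minimal sketch in plain Python, outside Elasticsearch. It assumes the text of one PDF has already been extracted (for example by Tika). The names `count_words` and `sample_text` are hypothetical ones I made up for illustration, and the simple tokenization is an assumption, not what any of the tools above actually do.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Assumption: a "word" is a run of letters, digits or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


# Hypothetical stand-in for the text extracted from one PDF file.
sample_text = "Elasticsearch indexes text. The text is split into words."

counts = count_words(sample_text)
# Print each distinct word with its count, most frequent first.
for word, n in counts.most_common():
    print(word, n)
```

This per-word count for each file is what I would like to see in a visualization.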

You asked me to provide a script, but the options I went through don't require one. All I need to do is give the name of the directory in which the files are stored, and the tool does the work for me!

I have been working on this for days now, and I have lost confidence that Elasticsearch will be able to count the individual words in a PDF file.

Can you please give me some references where someone has actually done this, because I don't want to waste any more time on it!

You are my only hope. Please help!

Regards,
Manas