Word count from documents

manasguduri · January 30, 2018, 1:12am

I have tried this:

I used Tika with the Python code I shared. It indexes the data as a '.keyword' field, but it doesn't show the count of individual words in a PDF file.
I used FSCrawler. It indexes the data as content rather than in '.keyword' format, so the field doesn't even show up in the Visualization tab.

Using the ingest plugin: I am still working on it. I haven't found a way to index a PDF file yet and am running into a lot of issues. I will keep working on that.
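
As a rough illustration of the result being asked for (this is not the code from the original post), the per-word count can be sketched in plain Python. The sketch assumes the PDF text has already been extracted into a string, for example by Tika; the function name `count_words` and the sample text are made up for illustration, and nothing here touches Elasticsearch or Kibana.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Treat runs of letters, digits, and apostrophes as words; everything else separates them.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


# Hypothetical extracted text standing in for the content of one PDF file.
sample = "Elasticsearch indexes text. Kibana visualizes text."
counts = count_words(sample)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

The open question in this thread is how to get the same kind of per-word breakdown out of the indexed field, rather than computing it outside Elasticsearch.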

You asked me to provide a script, but the approaches I went through don't require one. All I need to do is give the name of the directory where the files are stored, and it does the work for me!

I have been working on this for days now, and I have lost my belief that Elasticsearch will be able to count the individual words in a PDF file.

Can you please give me some references where someone has actually done this, because I don't want to waste any more time on it!

You are my only hope. Please help!

Regards,
Manas
