Word count from documents

dadoonet · January 29, 2018, 3:28pm

I don't read Python code. Could you provide a full recreation script, as described in About the Elasticsearch category? It will help to better understand what you are doing. Please try to keep the example as simple as possible.

"The other solution you were mentioning, ingest-attachment, I am not familiar with how to do that!"

Not really another solution but part of it. If you want to extract text from a PDF document, you can use:

ingest-attachment: Ingest Attachment plugin (Elasticsearch Plugins and Integrations)
FSCrawler: dadoonet/fscrawler on GitHub, the Elasticsearch File System Crawler (FS Crawler)
Apache Tika directly in Java: https://tika.apache.org/
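
For the ingest-attachment route, here is a rough sketch of the request bodies involved, written in Python only because that is what the thread is already using. It assumes the plugin is installed. The pipeline name "attachment", the source field "data" and the helper names are illustrative choices, not anything from the original post. The code only builds the JSON bodies and does not talk to a cluster. The word count at the end is a naive regex count that you could apply to the extracted text (the attachment processor writes it to attachment.content by default).

```python
import base64
import json
import re


def build_pipeline() -> dict:
    """Body for PUT _ingest/pipeline/attachment (the pipeline name is an assumption)."""
    return {
        "description": "Extract text from a base64-encoded document",
        # The attachment processor reads base64 content from the "data" field.
        "processors": [{"attachment": {"field": "data"}}],
    }


def build_document(raw: bytes) -> dict:
    """Body for PUT my-index/_doc/1?pipeline=attachment (index name is hypothetical)."""
    return {"data": base64.b64encode(raw).decode("ascii")}


def count_words(text: str) -> int:
    """Naive word count: number of runs of word characters."""
    return len(re.findall(r"\w+", text))


if __name__ == "__main__":
    print(json.dumps(build_pipeline(), indent=2))
    print(json.dumps(build_document(b"Hello Elasticsearch"), indent=2))
    print(count_words("Hello Elasticsearch"))
```

In a real setup you would send the first body once to create the pipeline, then index each file with the pipeline parameter set. How you count words afterwards (client side as above, or with a script at ingest time) is a separate design choice.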
