Visualizing the word count of each document (PDF, Word) in Kibana using FSCrawler


(manas) #1

The screenshot I provided is the indexing done by fscrawler.

As the screenshot shows, the content field contains the words of each document. But when I try to visualize that in Kibana and select 'Terms' as the aggregation, the 'content' field doesn't even show up in the split series.

I have been trying for days, please help!

Thanks in advance.

Manas


(Nathan Reese) #2

The content field is being stored in Elasticsearch as text. Text fields cannot be used for a terms aggregation; the field must have the keyword mapping type.

The terms aggregation creates a bucket per unique value. A value is not a token from a larger string but rather the entire string.
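For reference, a multi-field mapping along these lines keeps content full-text searchable while exposing a content.keyword sub-field that a terms aggregation can select (a minimal sketch, not the exact FSCrawler mapping; the index name my_index is a placeholder, and ignore_above: 256 is an illustrative limit that skips very long values):

```json
PUT /my_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
```

Note that even with this mapping, each bucket of the terms aggregation would be the entire content string of a document, not its individual words, for the reason described above.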


(manas) #3

Hi Nathan, thanks for responding. I did this indexing using FSCrawler.

I am new to Elasticsearch and Kibana. I came across FSCrawler and used it because it supports PDF indexing. I am not sure how to change the mapping to keyword, can you help me with that?

This is a screenshot of my FSCrawler .json file. Should I set "type": "keyword" in this file?

Thanks in advance !

Manas


(manas) #4

Previously, I used a Python script to index the PDF file, and it shows a keyword mapping as you described, but the terms aggregation on 'file.keyword' still buckets on the entire string rather than the individual tokens.

Thanks,
Manas


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.