Visualizing the word count of each document (PDF, Word) in Kibana using FSCrawler


(manas) #1

The screenshot I provided is the indexing done by fscrawler.

As the screenshot shows, the content field contains the words of each document. But when I try to visualize that in Kibana and select 'Terms' as the aggregation, the 'content' field doesn't even show up in the split series.

I have been trying for days, please help!

Thanks in advance.

Manas


(Nathan Reese) #2

The content field is being stored in Elasticsearch as text. Text fields cannot be used for a terms aggregation; the field must have the keyword mapping type.

The terms aggregation creates a bucket per unique value. A value is not a token from a larger string but rather the entire string.
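For reference, a multi-field mapping along these lines keeps content full-text searchable while exposing a content.keyword sub-field that a terms aggregation can select (a minimal sketch, not the exact FSCrawler mapping; the index name my_index is a placeholder, and ignore_above: 256 is an illustrative limit that skips very long values):

```json
PUT /my_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
```

Note that even with this mapping, each bucket of the terms aggregation would be the entire content string of a document, not its individual words, for the reason described above.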


(manas) #3

Hi Nathan, thanks for responding. I did this indexing using FSCrawler.

I am new to Elasticsearch and Kibana. I came across FSCrawler and used it because it supports PDF indexing. I am not sure how to change the mapping to keyword, can you help me with that?

This is a screenshot of my FSCrawler .json file. Should I set "type": "keyword" in this file?

Thanks in advance !

Manas


(manas) #4

Previously, I used a Python script to index the PDF file, and it shows a keyword mapping as you described, but the terms aggregation on 'file.keyword' still buckets on the entire string rather than the individual tokens.

Thanks,
Manas


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.