Visualizing the count of words in each document (PDF, Word) in Kibana


(manas) #1

Hello All,

I need to visualize the count of each word across my documents, and also how many times each word appears in each individual document.

Is that possible in Kibana?

Please respond quickly; I have been stuck on this for days.

Thanks in advance.

Regards,
Manas


Visulizing the count of words in a pdf
(Marius Dragomir) #2

Hello

What do your documents look like once they are indexed in Elasticsearch?
To get each word in a document, you can use the Terms aggregation on the field that contains the text of your document. This works only if the field was analysed by Elasticsearch at index time.
If you want to count a specific word across different documents, just filter on that word from the list that the terms aggregation returns.
You can check the Shakespeare example in the Kibana Getting Started tutorial: https://www.elastic.co/guide/en/kibana/current/getting-started.html
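
As a rough sketch of the two steps above, the request bodies could look like the following, written here as Python dictionaries. The field name `content` and the bucket name `words` are assumptions for illustration, not taken from your index. Note also that in recent Elasticsearch versions a terms aggregation on an analysed `text` field needs `fielddata` enabled on that field.

```python
import json


def word_buckets_body(field="content", top_n=20):
    """Terms aggregation: one bucket per word, with the number of documents containing it."""
    return {
        "size": 0,  # return only the aggregation results, no search hits
        "aggs": {
            "words": {  # hypothetical bucket name
                "terms": {"field": field, "size": top_n}
            }
        },
    }


def single_word_body(word, field="content"):
    """Match query: restrict the results to documents that contain one word."""
    return {"query": {"match": {field: word}}}


if __name__ == "__main__":
    print(json.dumps(word_buckets_body(), indent=2))
    print(json.dumps(single_word_body("good"), indent=2))
```

Each bucket returned by the first body carries a `doc_count`, which is the number of documents containing the word, not the number of times the word occurs.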


(manas) #3

Hello Marius,

Initially I used a Python script to index my PDF file, and more recently I used fscrawler. Both seem to work and index the data into fields; I have provided a screenshot.

The screenshot shows the index created by fscrawler.

According to the screenshot, the content field contains the words of the document, but when I try to visualize that field in Kibana, it does not even appear as an option for split series.

For example, when I indexed my file using the Python script, title.keyword held the title and file.keyword held the words in the file. If I select title.keyword for a vertical bar graph, it shows the names of the different files I use and their counts. But if I select file.keyword in the same way, it shows nothing, because it treats the whole of file.keyword as a single value.
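
To illustrate the behaviour described here, below is a rough Python sketch that only imitates how a keyword field and an analysed text field produce terms; it is not Elasticsearch code. The 256-character limit is an assumption based on the default `ignore_above` setting that dynamic mapping gives `.keyword` sub-fields, and the file name and text are made up.

```python
import re


def keyword_terms(value, ignore_above=256):
    """Imitate a keyword field: the whole value is one term, or nothing if it is too long."""
    return [value] if len(value) <= ignore_above else []


def analysed_terms(value):
    """Imitate an analysed text field very roughly: lowercase and split into words."""
    return re.findall(r"\w+", value.lower())


title = "report.pdf"                 # hypothetical file title
body = "text text text good " * 50   # stand-in for a file's full text (1000 characters)

print(keyword_terms(title))                  # ['report.pdf']
print(len(keyword_terms(body)))              # 0
print(analysed_terms("Text text good"))      # ['text', 'text', 'good']
```

Under these assumptions a short title survives as a single keyword term, while the full text of a file yields no keyword term at all, which would match an empty chart.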

I have tried in multiple ways but didn't succeed, please help!

Thanks in advance.

Manas.


(manas) #4

Hello Marius,

I am able to see the count of documents containing a specific word by using the 'terms' aggregation and then filtering on the word. But that is not what I want.

I want something like

Doc 1 - text text text good

Doc 2 - good one

then count of words as:

text - 3
good - 2
one - 1

According to what you are saying, each word is counted only once per document that contains it, so the result would be:

text - 1
good - 2
one - 1
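
The counts being asked for can be spelled out with a minimal plain-Python sketch. It runs outside Elasticsearch and Kibana, uses the two example documents above, and splits on whitespace as a simplifying assumption.

```python
from collections import Counter

docs = {
    "Doc 1": "text text text good",
    "Doc 2": "good one",
}

# Count of each word inside each document individually.
per_document = {name: Counter(body.lower().split()) for name, body in docs.items()}

# Total occurrences of each word across all documents.
total = sum(per_document.values(), Counter())

print(per_document["Doc 1"].most_common())  # [('text', 3), ('good', 1)]
print(total.most_common())                  # [('text', 3), ('good', 2), ('one', 1)]
```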

Thanks.

