New User - too dumb to create first index - please help

(David) #1

I would like to experiment with Elastic before using it for a research project I am developing. The project data is contained in unstructured PDF files, but for this exercise I am using Wikipedia.

My computer is running Windows 10.

I have downloaded Wikipedia as HTML files into a folder called '' on a USB drive called 'brown3', which is connected to a Synology NAS called 'synology2'.

The path to my folder is therefore \synology2\brown3\

I have installed Kibana 6 with the ingest plugin.

I have read countless pages on and viewed loads of YouTube videos, but I have been unable to translate their general advice to my specific needs, so I still cannot get Elastic/Kibana to index my files.

Though the text on wiki pages is organized using headings, sub-headings, numbered lists and bulleted lists, I want all the text to be indexed as plain text, to match my unstructured PDF files.

I would greatly appreciate your help.


(Mark Walkom) #2

If I can suggest, you're probably better off starting simpler. Check out

Otherwise you will need a way to read the files off disk, process them how you want, and index them into Elasticsearch. It sounds simple, but it's not when you are just starting out.
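To give a rough idea of what those three steps involve, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not a recommended setup: the index name `wikipedia`, the field names `filename` and `content`, the `_doc` type (Elasticsearch 6.x expects a type in the bulk action; newer versions do not), and the `http://localhost:9200` address are all placeholders, and the helper names are made up for this example.

```python
import json
from html.parser import HTMLParser
from pathlib import Path
from urllib import request


class TextExtractor(HTMLParser):
    """Collect visible text and ignore tags, scripts and styles."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Step 2: flatten headings, lists, etc. into one plain-text string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.parts)


def build_bulk_body(folder, index_name="wikipedia"):
    """Steps 1 and 2: read every .html file and build a _bulk NDJSON payload."""
    lines = []
    for path in sorted(Path(folder).rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        # Action line; "_type" is assumed for 6.x and should be dropped on 8.x.
        lines.append(json.dumps({"index": {"_index": index_name, "_type": "_doc"}}))
        # Document line; the field names are placeholders.
        lines.append(json.dumps({"filename": path.name, "content": html_to_text(html)}))
    return "".join(line + "\n" for line in lines)


def send_bulk(body, url="http://localhost:9200/_bulk"):
    """Step 3: POST the payload to the bulk endpoint (assumed local node)."""
    req = request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

You would call something like `send_bulk(build_bulk_body(folder))` with the path to your own folder. For a full Wikipedia dump you would also need to send the files in batches rather than one giant request, which is part of why a simpler first exercise is worth doing.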

(David Pilato) #3

You can also have a look at the FSCrawler project:

But as Mark said, start with something even easier. :wink:

(David) #4

Thank you both.

I've been reading the FSCrawler web page.

My operating system is on my computer's C: drive. My PDF files (to be indexed) are on one NAS. I would like my index to be created on another NAS.

Kibana is stored in c:/kibana

Where should I place my FSCrawler snapshot?

If on the C: drive, how and where do I specify the folder containing my PDF files?


(David Pilato) #5

Where should I place my FSCrawler snapshot?

Wherever you want.

how/where to specify the folder containing my PDF files?

