Define explicit mapping/settings per job

I want to index a number of research papers so that I can analyse features such as word proximity, sentence length, word count and spelling variations.
I want to index each document as a single unit of text, together with its filename.
The information below appears applicable, but I would appreciate advice on the content of '_settings.json' and '_settings_folder.json'.

https://fscrawler.readthedocs.io/en/fscrawler-2.5/admin/fs/elasticsearch.html?highlight=define%20a%20mapping

Define explicit mapping/settings per job

Let’s say you created a job named job_name and you are sending documents to an elasticsearch cluster running version 6.x.

If you create the following files, they will be picked up at job start time instead of the default ones:

  • ~/.fscrawler/{job_name}/_mappings/6/_settings.json
  • ~/.fscrawler/{job_name}/_mappings/6/_settings_folder.json

Those files are generated automatically when you create the job on the first run.
You can then edit them to change the path and other settings.
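
In case it helps, here is a rough Python sketch of how a per-job copy of '_settings.json' could be produced from a default mapping file. It is only an illustration under several assumptions: the function names are made up for this post, the field names `content` and `file.filename` are what I believe FSCrawler uses for the extracted text and the filename (check your generated file), and enabling `term_vector` on the text field is my suggestion for proximity analysis, not something the FSCrawler docs prescribe. The mapping type name is not hard-coded; the sketch patches whatever type it finds in the file.

```python
import copy
import json
from pathlib import Path


def job_mapping_dir(job_name, es_major="6", home=None):
    """Return <home>/.fscrawler/<job_name>/_mappings/<es_major> (layout from the docs)."""
    base = Path(home) if home is not None else Path.home()
    return base / ".fscrawler" / job_name / "_mappings" / es_major


def with_text_analysis(mapping):
    """Return a patched copy of a 6.x-style mapping; the input is left untouched.

    Assumption: extracted text lives in `content` and the filename in
    `file.filename`. Verify both against your own generated mapping.
    """
    patched = copy.deepcopy(mapping)
    # Elasticsearch 6.x mappings are keyed by type name, so patch each type found.
    for type_mapping in patched.get("mappings", {}).values():
        props = type_mapping.setdefault("properties", {})
        content = props.setdefault("content", {"type": "text"})
        # Store positions and offsets so term vectors can be queried later.
        content["term_vector"] = "with_positions_offsets"
        file_props = props.setdefault("file", {}).setdefault("properties", {})
        # Keep the filename as an exact, non-analysed value if not already mapped.
        file_props.setdefault("filename", {"type": "keyword"})
    return patched


def write_job_mapping(default_file, job_name, home=None):
    """Read a default mapping file and write the patched copy for one job."""
    mapping = json.loads(Path(default_file).read_text(encoding="utf-8"))
    target = job_mapping_dir(job_name, home=home) / "_settings.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(with_text_analysis(mapping), indent=2), encoding="utf-8")
    return target
```

You would call `write_job_mapping(path_to_default_settings_json, "job_name")` before starting the job, then check the index mapping in elasticsearch to confirm the change was picked up. Sentence length and spelling variations are probably better handled by a custom analyzer or outside elasticsearch, so treat this as a starting point only.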

I'm not sure I understood your question correctly, though. Which part is unclear to you?
