Great that you found the problem and shared the solution. When we spoke on GitHub, I meant that you should open a new issue there, not specifically here, but that's fine since nothing is lost.
For your next post, please don't post images of text as they are hard to read, may not display correctly for everyone, and are not searchable.
Actually, I seem to have run into another error:
10:23:48,194 INFO [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [418.8mb/7.1gb=5.75%], RAM [9.9gb/31.9gb=31.16%], Swap [7gb/39.9gb=17.64%].
10:23:48,529 INFO [f.p.e.c.f.c.FsCrawlerCli] attributes_support is set to true but getting group is not available on [windows server 2016].
10:23:48,540 INFO [f.p.e.c.f.FsCrawlerImpl] attributes_support is set to true but getting group is not available on [windows server 2016].
10:23:49,086 INFO [f.p.e.c.f.c.v.ElasticsearchClientV7] Elasticsearch Client for version 7.x connected to a node running version 7.5.0
10:23:49,147 INFO [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
10:23:49,147 INFO [f.p.e.c.f.FsCrawlerImpl] FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
10:23:49,353 INFO [f.p.e.c.f.FsParserAbstract] FS crawler started for [hvr] for [F:\hvr_copy] every [1m]
10:23:49,681 WARN [o.a.t.p.PDFParser] J2KImageReader not loaded. JPEG2000 files will not be processed.
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
10:24:50,586 WARN [f.p.e.c.f.FsParserAbstract] Can't find stored field name to check existing filenames in path [F:\hvr_copy\00\2a]. Please set store: true on field [file.filename]
10:24:50,586 WARN [f.p.e.c.f.FsParserAbstract] Error while crawling F:\hvr_copy: Mapping is incorrect: please set stored: true on field [file.filename].
Do I need to create an Elasticsearch mapping before crawling data into an index?
This probably means that the index was created before FSCrawler started, and that it was created with an incompatible mapping. If you don't need anything specific in the mapping and you don't care about the existing data, just delete the index and start FSCrawler again.
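As a rough sketch of what deleting the index could look like, here is a small Python script that sends a `DELETE` request to Elasticsearch. The node address `http://localhost:9200` and the index name `hvr` are assumptions (the index name is taken from the job name in the log above); adjust both to your setup. The helper names `build_delete_request` and `delete_index` are made up for this illustration.

```python
import urllib.request
from urllib.parse import quote

# Assumptions: a local, unsecured node and an index named after the job.
ES_URL = "http://localhost:9200"
INDEX = "hvr"


def build_delete_request(base_url: str, index: str) -> urllib.request.Request:
    """Build the HTTP DELETE request that removes one index."""
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}"
    return urllib.request.Request(url, method="DELETE")


def delete_index(base_url: str, index: str) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError if the index does not exist (404).
    """
    with urllib.request.urlopen(build_delete_request(base_url, index)) as resp:
        return resp.status


if __name__ == "__main__":
    print(delete_index(ES_URL, INDEX))
```

This permanently removes all documents in that index, so only run it if you really don't need the existing data.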