Hi,
I have placed jai-imageio-core-1.3.0.jar and jai-imageio-jpeg2000-1.3.0.jar into the LIB directory for FSCrawler, but the warning saying J2KImageReader not loaded still appears.
Why would this be?
Yeah. That's a bug. See
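In the meantime, if you want to confirm that the two JARs themselves are fine, here is a minimal sketch that asks Java's Image I/O registry whether a JPEG 2000 reader is visible. It is not FSCrawler code: the class name `ImageReaderCheck` is made up for illustration, and the format name "jpeg2000" is what I would expect the jai-imageio plugin to register, so treat that as an assumption.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

// Hypothetical helper, not part of FSCrawler or Tika.
final class ImageReaderCheck {

    // Returns true if at least one ImageReader is registered for the given format name.
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Ask Image I/O to look again for plugins on the classpath.
        ImageIO.scanForPlugins();

        // "png" ships with the JDK; "jpeg2000" is the name assumed for the jai-imageio plugin.
        for (String name : new String[] {"png", "jpeg2000"}) {
            System.out.println(name + " reader available: " + hasReader(name));
        }
    }
}
```

Run it with both JARs on the classpath. If the JPEG 2000 line reports a reader there but FSCrawler still logs the warning, the problem is more likely in how the LIB directory is picked up than in the JARs.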
Okay, thanks! Any idea when this might be fixed?
We were looking at adding the JARs to the LIB folder because some of our PDF files won't index while others do. We can't tell why that is, and thought this would fix the issue. Is this a possible cause, or are we looking in the wrong direction?
That may be another problem. Did you activate OCR?
As far as I know, we did. The OCR section of our settings.yaml looks like this:
ocr:
  language: "eng"
  enabled: true
  pdf_strategy: "ocr_and_text"
Could you run FSCrawler with
--debug --restart --loop 1
options, with a single PDF file in the directory your job is watching?
We ran something similar, fscrawler files_corp_tech --debug > debug_tech_output.txt, and for whatever reason it indexed them fine this time around.
17:21:45,468 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [false], filename = [/STR/Reference Materials/103 - TIMBER/GENERAL/CWC WOOD DESIGN MANUAL/Wood Desin Manual 2017 pswd.txt],
However, this file does not index. What does directory = [false] mean?
It means that the file is actually a file and not a directory.
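To illustrate the idea, here is a small sketch of how such a flag can be produced with the standard `java.nio.file` API. This is not the actual FSCrawler implementation: the class `DirectoryFlagDemo` and the method `describe` are hypothetical, and the output only imitates the shape of the log line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, only meant to show what a directory flag reflects.
final class DirectoryFlagDemo {

    // Builds a message shaped like the debug line: true for directories, false for anything else.
    static String describe(Path path) {
        return "directory = [" + Files.isDirectory(path) + "], filename = [" + path.getFileName() + "]";
    }

    public static void main(String[] args) throws IOException {
        // Create one temporary directory and one regular file inside it.
        Path dir = Files.createTempDirectory("demo-dir");
        Path file = Files.createTempFile(dir, "demo", ".txt");

        System.out.println(describe(dir));
        System.out.println(describe(file));
    }
}
```

So the flag only says what kind of entry was found. It does not say whether the entry was indexed afterwards.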