But somehow, if I change the update_rate, it stays at 15 minutes after a restart.
Is my config in the correct format? It seems like update_rate is being assigned to server.
Hey @dadoonet, I'm experiencing another issue with the url.
If the url contains a subfolder, an exception is raised.
I have the following structure:
TestData
| - TestDataSubFolder
url: "/D:/TestData"
Exception:
11:00:40,889 WARN [f.p.e.c.f.FsParserAbstract] Error while crawling /D:/TestData: String index out of range: -1
11:00:40,889 WARN [f.p.e.c.f.FsParserAbstract] Full stacktrace
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(Unknown Source) ~[?:1.8.0_202]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.indexDirectory(FsParserAbstract.java:541) ~[fscrawler-core-2.7-SNAPSHOT.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:289) ~[fscrawler-core-2.7-SNAPSHOT.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:149) [fscrawler-core-2.7-SNAPSHOT.jar:?]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_202]
11:00:40,905 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is going to sleep for 15m
It works if there isn't a subfolder. Quite strange.
Edit:
Somehow the same exception is now also being raised when there isn't a subfolder.
It doesn't work anymore...
I'm quite confused. After looking at the code and reproducing this line in particular:
I always get -1 from path.lastIndexOf, which means the separator couldn't be found.
So I wrote my string like this: \D:\TestData, and lastIndexOf returns the correct index.
Now I'm confused about how this worked before, or whether I changed something that is now causing the errors...
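For anyone who wants to reproduce the behaviour outside of FSCrawler, here is a minimal sketch. It is not the actual FsParserAbstract code: the class SeparatorDemo and the helper parentOf are made up for illustration, and I'm assuming the separator being searched for on Windows is the backslash. It only shows that lastIndexOf returns -1 when the path uses forward slashes, and that passing that -1 on to substring throws a StringIndexOutOfBoundsException.

```java
public class SeparatorDemo {

    // Hypothetical helper, NOT FSCrawler code: returns everything before the
    // last occurrence of the separator (the "parent" part of the path).
    static String parentOf(String path, String separator) {
        int idx = path.lastIndexOf(separator); // -1 when the separator is absent
        return path.substring(0, idx);         // throws when idx is -1
    }

    public static void main(String[] args) {
        // Assumption: the separator searched for on Windows is the backslash.
        String backslash = "\\";

        // Path written with backslashes: the separator is found.
        String windowsStyle = "\\D:\\TestData";
        System.out.println("index:  " + windowsStyle.lastIndexOf(backslash));
        System.out.println("parent: " + parentOf(windowsStyle, backslash));

        // Path written with forward slashes: no backslash, so lastIndexOf gives -1.
        String urlStyle = "/D:/TestData";
        System.out.println("index:  " + urlStyle.lastIndexOf(backslash));
        try {
            parentOf(urlStyle, backslash);
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message text depends on the Java version.
            System.out.println("substring failed: " + e.getMessage());
        }
    }
}
```

If that assumption holds, it would explain why \D:\TestData gets past this line while /D:/TestData does not, but it doesn't explain why the forward-slash form worked earlier.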