Hello!
I was running the Elastic Stack on Windows, but now I have to migrate to Ubuntu on a cloud server.
The problem is that I was using FSCrawler to ingest the files into the stack, which runs smoothly on Windows. However, there are no DEB/RPM packages for Linux yet, as previously discussed here.
The workaround I found was to keep the working FSCrawler installation on Windows and feed the Linux stack over SSH (which FSCrawler supports).
Despite configuring SSH, and even though it is able to read the files on the server, I am getting an error: `WARN [f.p.e.c.f.FsParserAbstract] Error while crawling /tmp/sandbox: String index out of range: -1`
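For context, my job settings look roughly like this (hostname, credentials, and the Elasticsearch URL are placeholders, and the exact keys may differ slightly between FSCrawler versions):

```yaml
# ~/.fscrawler/my_job/_settings.yaml (sketch, values are placeholders)
name: "my_job"
fs:
  url: "/tmp/sandbox"        # directory on the remote Linux server
server:
  hostname: "my-cloud-server"
  port: 22
  username: "ubuntu"
  password: "secret"
  protocol: "ssh"            # crawl the remote directory over SSH
elasticsearch:
  nodes:
  - url: "http://127.0.0.1:9200"
```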
So, I would like to ask two questions:
1 - Do you have any tips/instructions on how to properly install FSCrawler on Linux?
2 - What can I do to solve the error?
Update: today, when I tried to run it again, it worked, sending the files hosted on my Windows machine to the cloud.
Anyway, thank you for your attention. While I'm at it, may I ask: is an FSCrawler RPM/DEB package on the roadmap? That way I'll be able to run everything fully in the cloud.