Elasticsearch Transfer Physical Files

Hi,

First-time poster. I would like to know whether I can use Elasticsearch to transfer a physical file, such as a Word document or a PDF, from the client to the server. Are there any APIs and/or plug-ins available to do this?

Thanks in advance!
Eric

No, there's not.

You can look at the FSCrawler project, which provides such an upload mechanism through its REST feature.

Thanks all for the reply!

Dadoonet, does FSCrawler work with the latest version of Elasticsearch (5.2)? If not, do I have to use an older version?

It works with 1.x to 5.x (and probably 6.0 as well :))

Hi David, I tried to install FSCrawler on Windows 8 by following the instructions on GitHub. When I entered "bin/fscrawler job_name" at the command prompt, the whole process just paused: no prompt, no message, nothing. Then, when I quit it (Ctrl + C), I got this error:
Exception in thread "main" java.util.NoSuchElementException
at java.util.Scanner.throwFor(Scanner.java:862)
at java.util.Scanner.next(Scanner.java:1371)
at fr.pilato.elasticsearch.crawler.fs.FsCrawler.main(FsCrawler.java:212)

I do have JAVA_HOME set in my environment variables, and it points to: C:\Program Files\Java\jdk1.8.0_121.

Thanks!

Can you try changing JAVA_HOME to a directory that has no space in its name? It might be a bug.

Oh, actually, use the latest snapshot. I fixed some bugs related to Windows.

Thanks! I got that working fine.

Another question: I am looking at this:

echo "This is my text" > test.txt
curl -F "file=@test.txt" "http://127.0.0.1:8080/fscrawler/_upload"

Can I make an HTTP POST to http://127.0.0.1:8080/fscrawler/_upload from Visual Studio? Also, do I pass the file object that the client uploaded as the "file" parameter, as in your example?

I have no idea about Visual Studio TBH.
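
For anyone calling the endpoint from code rather than curl: `curl -F "file=@test.txt"` sends an ordinary multipart/form-data POST with a single part named "file", so any HTTP client that can build such a request should work. Below is a minimal sketch in Python using only the standard library. The helper names `build_multipart` and `upload` are made up for illustration, and the URL is simply the one from the curl example above; I have not verified this against every FSCrawler version.

```python
import os
import urllib.request
import uuid


def build_multipart(field, filename, content, boundary=None):
    """Encode one file as a multipart/form-data body, like `curl -F` does.

    Hypothetical helper: returns (body_bytes, content_type_header).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload(path, url="http://127.0.0.1:8080/fscrawler/_upload"):
    """POST the file at `path` as the "file" form field; return the response text.

    Assumes FSCrawler is running with its REST service listening on `url`.
    """
    with open(path, "rb") as f:
        body, content_type = build_multipart("file", os.path.basename(path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example (needs a running FSCrawler REST service):
#   print(upload("test.txt"))
```

The same idea should carry over to other languages: the bytes the client uploaded go into the part named "file", and the rest is a normal POST.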

I think I can figure that out by myself. Thanks for the reply!

In order to update or delete a file, should I call _update or _delete, just like I do with _upload? For example: http://127.0.0.1:8080/fscrawler/_update, or http://127.0.0.1:8080/fscrawler/_delete?

No, that's not implemented. Can you open an issue so I can add it?

Where does FSCrawler put the physical file after I call "http://127.0.0.1:8080/fscrawler/_upload"? I thought it would be in "c:\tmp\es", but it was not there.

For now, it doesn't put the file anywhere. It just indexes its content in Elasticsearch.
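
Since only the extracted content ends up in Elasticsearch, one way to check that an upload worked is to search for some of the text. Here is a rough sketch in Python (standard library only). It rests on assumptions that are not confirmed in this thread: that Elasticsearch listens on http://127.0.0.1:9200, that the index is named after the FSCrawler job ("job_name" here), and that the extracted text lives in a field called "content". The helper names are made up for illustration.

```python
import json
import urllib.request


def build_search_request(text, index="job_name", es_url="http://127.0.0.1:9200"):
    """Build the URL and JSON body for a simple match query.

    Assumptions: index named after the job, text stored in a "content" field.
    """
    url = f"{es_url}/{index}/_search"
    body = json.dumps({"query": {"match": {"content": text}}}).encode("utf-8")
    return url, body


def search(text, **kwargs):
    """Run the match query against Elasticsearch and return the parsed response."""
    url, body = build_search_request(text, **kwargs)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (needs a running Elasticsearch with the uploaded document indexed):
#   hits = search("This is my text")["hits"]["hits"]
```

If the index or field names differ in your setup, adjust the `index` argument and the field inside the query accordingly.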
