Which web crawler works best with ES

(Cody) #1

Hey everyone,

I am evaluating which crawler to use with ES. Do you guys have any experience or suggestions? I did some research, and the choices I found include Nutch and River Web. Personally, I don't want to bring in another piece of software such as HBase.


(Bruce) #2

Did you ever get a crawler working with ES?

I hear Nutch works with ES 2.x, but I'm not really interested in going backwards.

(Cody) #3

I ended up using Nutch. And yes, Nutch only works with ES 2.3 at this point. Since 2.3 has all the functions I need, I'm fine with it. I recently found Scrapy to be very powerful, though. It may be worth a try if you don't mind writing some indexing code on your own.
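
For anyone wondering what "writing some indexing code on your own" might look like, here is a rough sketch, not something I have run in production. It assumes each crawled item is a dict with a `url` field, and it only builds the newline-delimited body that the Elasticsearch `_bulk` endpoint accepts. The class follows the usual shape of a Scrapy item pipeline (`open_spider`, `process_item`, `close_spider`), but the class name, the `pages` index, and the `page` type are all made up for illustration.

```python
import hashlib
import json


def build_bulk_body(items, index, doc_type=None):
    """Serialize crawled items into an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for item in items:
        # Derive a stable _id from the URL so re-crawling a page overwrites
        # the old document instead of creating a duplicate.
        doc_id = hashlib.sha1(item["url"].encode("utf-8")).hexdigest()
        action = {"index": {"_index": index, "_id": doc_id}}
        if doc_type is not None:
            # Older releases such as ES 2.x expect a mapping type as well.
            action["index"]["_type"] = doc_type
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(item))    # document source line
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


class ElasticsearchBulkPipeline:
    """Hypothetical Scrapy-style item pipeline that buffers items for indexing."""

    def open_spider(self, spider):
        self.buffer = []

    def process_item(self, item, spider):
        self.buffer.append(dict(item))
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        # Assumed names: index "pages", type "page". A real pipeline would
        # POST self.body to the cluster's _bulk endpoint at this point.
        self.body = build_bulk_body(self.buffer, index="pages", doc_type="page")
```

Hashing the URL into the `_id` is just one design choice. It keeps repeated crawls idempotent, at the cost of treating two URLs that serve the same page as separate documents.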
