Crawling with ElasticSearch

(peterbailey) #1

Hi all. I am just starting to look at ElasticSearch, and am wondering about crawling some large existing sites, as well as doing programmatic updates to indexes as we add content.

We are a Rails/Ruby shop. I guess we can write this from scratch (using retire, or the new elasticsearch gem), but older posts here have a lot of recommendations for tools like Nutch. I am probably just going to crawl our sites once to build the initial indexes, and subsequently use calls from our Rails site to adjust the indexes as appropriate.
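To make the second half concrete, here is a rough sketch of what the calls from the Rails side might look like. It uses only Ruby's standard library and talks to the Elasticsearch REST API directly, rather than going through either gem. Everything named here is an assumption for illustration: the `ES_URL` address, the `pages` index, the document fields, and the helper names. It also assumes a recent Elasticsearch where documents are addressed under `_doc`; older versions expect a type name in that position.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed address of a local Elasticsearch node (hypothetical).
ES_URL = "http://localhost:9200"

# Build a PUT request that creates or replaces one document by id.
def index_request(index, id, doc)
  uri = URI("#{ES_URL}/#{index}/_doc/#{id}")
  req = Net::HTTP::Put.new(uri, "Content-Type" => "application/json")
  req.body = JSON.generate(doc)
  req
end

# Build a DELETE request that removes one document by id.
def delete_request(index, id)
  Net::HTTP::Delete.new(URI("#{ES_URL}/#{index}/_doc/#{id}"))
end

# Send a prepared request to the node and return the HTTP response.
def send_request(req)
  Net::HTTP.start(req.uri.host, req.uri.port) { |http| http.request(req) }
end
```

Something like `send_request(index_request("pages", page.id, { "title" => page.title }))` could then run whenever a page is saved, and `send_request(delete_request("pages", page.id))` when one is removed. The one-off crawl could feed the same `index_request` helper.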

Does anyone have any good solutions for these use cases? I don't mind writing some code for this, but I don't want to reinvent the wheel...

Thanks for listening,

