Website crawl and index into Elasticsearch

Hi,

I am quite new to Elasticsearch, but have prior experience with Solr.
My use case is that we have an intranet website with documents attached as links in the web pages.
I have been asked to design a search architecture that can search through the web pages as well as the attached documents.
The problem is that I have not been able to find a way to index this website data into Elasticsearch.
Could someone please point me in the right direction?

Thanks,
Venture M.

A lot of people use Apache Nutch for this.
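
To give a rough idea of what a crawler does for each page before indexing, here is a minimal Python sketch using only the standard library. It is not how Nutch is implemented. It extracts the visible text and outgoing links from one HTML page and formats the resulting documents as a newline-delimited body for the Elasticsearch `_bulk` endpoint. The index name `intranet`, the field names `url`, `content` and `links`, and the example host are assumptions made for illustration. Fetching pages, following links, and extracting text from binary attachments such as PDFs are left out.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects visible text and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def page_to_doc(url, html):
    """Turns one fetched page into a dict ready to be indexed."""
    parser = PageParser(url)
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "content": " ".join(parser.text_parts),
        "links": parser.links,
    }


def to_bulk_body(index_name, docs):
    """Builds an NDJSON body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        # Using the URL as _id means re-crawling a page overwrites its old copy.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["url"]}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

The string returned by `to_bulk_body` could be sent in a POST request to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. The links collected per page are what a crawler would use to discover further pages and the attached documents.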

Thanks for your response.
Is there a tutorial that I can use for a POC?

Best Regards,
Venture

Probably. Your favourite search engine should be able to find one for you.
