Elasticsearch - index emails and crawl links


I just came across a different requirement from our client. They want to index emails coming from blog sources (telegraph.co, Europol.uk, etc.). In addition to the email content, if an email contains any links (hyperlinks to further sources), Elasticsearch should also import the content those links point to.

Is this possible out of the box, or do I have to extend a plugin or the API?


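You can use Logstash to fetch the emails (for example with its IMAP input plugin), but Logstash can't crawl sites. To follow the links you would need a separate crawler or a small custom script that downloads the linked pages and indexes them alongside the email.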
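
Since the crawling part has to live outside Logstash, here is a minimal sketch of what such a script might look like in Python, using only the standard library. It parses a raw email, collects the http(s) links from its text and HTML parts, and builds one document for the email plus one per linked page. The names `extract_links`, `build_documents`, `fetch_url` and the document fields (`type`, `subject`, `body`, `url`, `content`) are all hypothetical and chosen for illustration; they are not part of Elasticsearch or Logstash.

```python
import re
import urllib.request
from email import message_from_string
from html.parser import HTMLParser
from typing import Callable, Dict, Iterator, List, Tuple

# Rough pattern for bare URLs in plain text (an assumption, not a full URL grammar).
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


class _HrefCollector(HTMLParser):
    """Collects absolute http(s) href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.hrefs.append(value)


def _text_parts(raw_email: str) -> Iterator[Tuple[str, str]]:
    """Yield (subtype, decoded text) for every text/* part of the email."""
    msg = message_from_string(raw_email)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        yield part.get_content_subtype(), payload.decode(charset, errors="replace")


def extract_links(raw_email: str) -> List[str]:
    """Return the unique http(s) links in the email, in order of appearance."""
    links: List[str] = []
    for subtype, text in _text_parts(raw_email):
        if subtype == "html":
            collector = _HrefCollector()
            collector.feed(text)
            found = collector.hrefs
        else:
            found = URL_RE.findall(text)
        for url in found:
            url = url.rstrip(".,;")  # drop trailing sentence punctuation
            if url not in links:
                links.append(url)
    return links


def fetch_url(url: str, timeout: float = 10.0) -> str:
    """Download a page as text. No retries, robots.txt handling, or rate limiting."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def build_documents(raw_email: str, fetch: Callable[[str], str] = fetch_url) -> List[Dict[str, str]]:
    """Build one document for the email and one for each page it links to."""
    subject = message_from_string(raw_email).get("Subject", "")
    body = "\n".join(text for _, text in _text_parts(raw_email))
    docs = [{"type": "email", "subject": subject, "body": body}]
    for url in extract_links(raw_email):
        docs.append({"type": "linked_page", "url": url,
                     "source_subject": subject, "content": fetch(url)})
    return docs
```

Each resulting dictionary could then be sent to Elasticsearch, for instance with the official Python client or its bulk helper. A real crawler would also want error handling for dead links, HTML-to-text extraction for the fetched pages, and some limit on how deep it follows links; none of that is covered here.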