Configuring the web crawler (github.com/elastic/crawler) to fetch only specific URLs

This fixed it: How to index only given urls in the Elasticsearch using Open Crawler - #2 by nfeekery

You need to:

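  • Set seed_urls to the URLs that you want to sync
  • Set sitemap_discovery_disabled: true, so no extra URLs are picked up from sitemaps
  • Set max_crawl_depth: 1, so the crawler does not follow links found on the seed pages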
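
Put together, the three settings might look roughly like this in the crawler config file. This is only a sketch: the domain, the seed URLs, and the index name are placeholders, and the placement of the keys (seed_urls under the domain entry, the other two at the top level) is my assumption based on the example config in the repo, so check it against your version.

```yaml
# Sketch of a crawler config; example.com, the paths, and the index name are placeholders.
domains:
  - url: https://example.com
    # Only these pages should be fetched and indexed.
    seed_urls:
      - https://example.com/page-one
      - https://example.com/page-two

output_sink: elasticsearch
output_index: my-crawler-index   # hypothetical index name

# Do not discover additional URLs through sitemaps.
sitemap_discovery_disabled: true

# Depth 1 = the seed URLs themselves; links on those pages are not followed.
max_crawl_depth: 1
```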