jerrac (David Reagan) | April 19, 2021, 10:00pm | #1
When I try to run the web crawler against a site we host, it fails with this error:
Failed HTTP request: Unable to request "<domain>" because it resolved to only private/invalid addresses
The site in question resolves to a 10.n.n.n IP address. Is the crawler configured to reject that? Is there a way to override that behavior?
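For what it's worth, here's a minimal Python 3 sketch of the kind of check I'm describing (intranet.example.com is just a placeholder for our internal hostname, and the private-address logic is only my guess at what the crawler enforces, not its actual implementation):

```python
import ipaddress
import socket


def resolved_addresses(hostname):
    """Resolve a hostname and return the set of IP address strings it maps to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def has_public_address(hostname):
    """Return True if at least one resolved address is globally routable."""
    for addr in resolved_addresses(hostname):
        ip = ipaddress.ip_address(addr)
        # is_global is False for private ranges (10/8, 172.16/12, 192.168/16),
        # loopback, link-local, and other special-purpose blocks.
        if ip.is_global:
            return True
    return False


if __name__ == "__main__":
    # Placeholder hostname: substitute the site you are trying to crawl.
    host = "intranet.example.com"
    try:
        print(host, "->", sorted(resolved_addresses(host)))
        print("has a public address:", has_public_address(host))
    except socket.gaierror as exc:
        print("DNS lookup failed:", exc)
```

If every address printed falls in a private range, that matches the error above.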
jerrac (David Reagan) | April 19, 2021, 10:12pm | #2
Not sure it's related, but if I target my personal site, which is not hosted internally, it fails as well.
In the logs I see:
Allow none because robots.txt responded with status 599
and
Failed HTTP request: Remote host terminated the handshake
That also happens if I target the Elastic Blog.
I double-checked my personal site's robots.txt file. It's the default Drupal 8 robots.txt file, so there shouldn't be anything in it that would completely block the crawler.
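If I understand it right, the 599 isn't a real response from my server but a status the crawler reports when the robots.txt request itself fails, so the terminated handshake is probably the root cause rather than anything in the file. A minimal Python 3 handshake check like the one below can help confirm whether TLS negotiation is the problem, independent of the crawler (www.example.com is a placeholder for the target site):

```python
import socket
import ssl


def try_tls_handshake(hostname, port=443, timeout=10):
    """Attempt a TLS handshake and return the negotiated protocol and cipher."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.version(), tls.cipher()


if __name__ == "__main__":
    # Placeholder hostname: substitute the site that fails to crawl.
    try:
        version, cipher = try_tls_handshake("www.example.com")
        print("Handshake OK:", version, cipher)
    except OSError as exc:  # includes ssl.SSLError
        print("Handshake failed:", exc)
```

If this succeeds locally but the crawler still can't connect, the mismatch is more likely between the TLS versions or ciphers the crawler offers and what the web server accepts.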
Anyway, I'm glad this is still beta.
orhantoy (Orhan Toy) | April 20, 2021, 9:02am | #3
Yes, that's the current default behavior, and it will become configurable in the next minor release.
As for the other issue you're experiencing, it sounds like you can't crawl any site at all, is that correct?
jerrac (David Reagan) | April 20, 2021, 3:29pm | #4
Nice.
Yep, I can't crawl my personal site or the Elastic Blog. I haven't tried any other sites yet.
system (system) | Closed | May 18, 2021, 3:30pm | #5
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.