App Search not chasing HTTP 302s when validating URLs?

When adding a new URL to the web crawler, the validation fails with this error:

When we fetched the web page at [my domain], it returned data that was not HTML.

If I curl -I [my domain] from the pod's shell I see that the response is an HTTP 302 with a Location header.

If I curl [my domain] I get the following text:

<html><body>You are being <a href="[my domain]/[path]">redirected</a>.</body></html>

And if I curl [my domain]/[path] I get valid HTML.
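
For anyone who wants to reproduce the same check outside curl, here is a minimal Python sketch that follows the Location headers by hand and lists each hop with its status and Content-Type. It only mirrors the manual curl steps above; it makes no claim about how the crawler validates URLs internally. The names `trace_redirects`, `head`, and `max_hops` are made up for this illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url):
    """Send one HEAD request (like `curl -I`) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders())
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=5):
    """Return a list of (url, status, content_type), one entry per hop.

    `fetch` is any callable taking a URL and returning (status, headers);
    it is injectable so the logic can be exercised without a network.
    """
    hops = []
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        # Header names are case-insensitive, so normalise before lookup.
        headers = {k.lower(): v for k, v in headers.items()}
        hops.append((url, status, headers.get("content-type", "")))
        location = headers.get("location")
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace_redirects("https://[my domain]")` with the real domain filled in should show the 302 hop first and then whatever the Location target returns.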

I'm going to specify the entry point for this domain once it's added, but I can't get that far in the wizard.

How can I add this site to the crawler?

Hi @lgoolsby !

I would recommend using the new Enterprise Search 7.15 version. This version adds some enhancements to the way the Web Crawler handles redirections for the domain pages.

Could you please give it a go and share what your experience is?

In case this is not possible, it would be great to have more details:

  • Your domain, in case it can be shared with us
  • The full cURL requests and responses for your domain

Thanks!

Yep, that fixed the problem. Thanks!
