I'm getting an "Unexpected error while running the crawl, check system logs for details" message in the crawler logs when trying to crawl a list of 1000 sites. Does anyone know which logs I should check, given that this is already the crawler log?
If you could let me know your Deployment ID, I could take a peek for you. Alternatively, you can open a support case. We're working on surfacing more helpful errors to the operator, and those are coming soon.
Just wanted to confirm that I received your Deployment ID by message. I'm going to get one of the Crawler team members to take a peek when they get a moment. I was able to confirm the "Unexpected error ... " log message, but I don't have a good idea of how to resolve it just yet.
Also, I should've asked for clarification on this earlier, but when you say:
... when trying to crawl a list of 1000 sites
Do you really mean that you've configured 1000 domains to be crawled, or did you mean that you're crawling a single domain that you expect to contain about 1000 pages?
Ah, my team member remembers fixing what is likely this issue. Expect a resolution in App Search 7.12.
In the meantime, the problem occurs when the crawler is crawling sites with invalid links. We see it most often when there's a malformed <link ... /> tag.
I'll message you the last crawled domain I see in the logs before the latest "Unexpected error" message. However, I can't guarantee that it's the culprit domain.
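If it helps narrow things down while waiting for the fix, here is a rough Python sketch for spotting suspicious links in a page's HTML, including the malformed <link ... /> case mentioned above. It is not how the crawler validates links; the checks in `check_href` are my own guesses at what "invalid" might look like, and the names `LinkAuditor`, `check_href` and `audit` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def check_href(href):
    """Return a reason string if the href looks suspicious, else None.

    These heuristics are assumptions for illustration only; the crawler's
    real validation rules are not known here.
    """
    if href is None or not href.strip():
        return "missing or empty href"
    href = href.strip()
    if any(ch.isspace() for ch in href):
        return "whitespace inside URL"
    try:
        # urlsplit raises ValueError for e.g. an unclosed IPv6 bracket;
        # reading .port raises ValueError for a non-numeric port.
        urlsplit(href).port
    except ValueError as exc:
        return f"unparseable URL: {exc}"
    return None


class LinkAuditor(HTMLParser):
    """Collects (tag, href, reason) for suspicious <a> and <link> tags."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here.
        if tag not in ("a", "link"):
            return
        href = dict(attrs).get("href")
        reason = check_href(href)
        if reason:
            self.suspicious.append((tag, href, reason))


def audit(html):
    """Parse an HTML string and return the suspicious links found in it."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.suspicious


if __name__ == "__main__":
    sample = (
        '<link rel="stylesheet" href="http://[broken" />'
        '<a href="/ok">fine</a>'
        '<a href="">empty</a>'
    )
    for tag, href, reason in audit(sample):
        print(f"<{tag}> href={href!r}: {reason}")
```

You could run `audit()` over saved copies of the pages from the domains you suspect and see which ones report anything. A clean result does not prove a page is safe for the crawler, since these checks are only approximations.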