The documentation says you can start a web crawler per search engine via the API. That is really cool because you can connect your CMS with the search engine.
What our customer would really like is this: when they update one page on their website, the crawler quickly scans just that page so the new or changed page shows up in the search results. They of course want to automate that via the API. So, if they publish or update a page in their CMS, an API call should tell the web crawler to scan just that one page.
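To make the request concrete, here is a rough sketch of the CMS-side hook we have in mind. It is only an illustration: the endpoint path, the `url` body field, and the function names are all hypothetical, since the crawler API does not document a single-page crawl. The sketch only builds the request and hands it to a caller-supplied `send` function, so nothing here depends on a real API.

```python
import json
from urllib.parse import urlparse

# HYPOTHETICAL endpoint path, invented for illustration only.
# The crawler API does not document a single-page crawl.
HYPOTHETICAL_ENDPOINT = "/engines/{engine}/crawler/single_page_crawl"


def build_single_page_crawl_request(engine: str, page_url: str) -> dict:
    """Describe the API call we would like to make for one updated page."""
    parsed = urlparse(page_url)
    # Only absolute http(s) URLs make sense as a crawl target.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {page_url!r}")
    return {
        "method": "POST",
        "path": HYPOTHETICAL_ENDPOINT.format(engine=engine),
        # The "url" field name is an assumption as well.
        "body": json.dumps({"url": page_url}),
    }


def on_page_published(engine: str, page_url: str, send):
    """CMS publish/update hook: build the request and pass it to `send`."""
    request = build_single_page_crawl_request(engine, page_url)
    return send(request)
```

In the CMS, `on_page_published` would be called from the publish/update event, with `send` being whatever HTTP client wrapper the integration already uses.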
Hi, that's not currently possible, but it's an interesting scenario. Is the idea that you want the search engine to be updated more quickly, instead of having to wait for a full crawl of the entire website?
Hi, yes. That is exactly the use case. They don't want to wait until the crawl is complete, which takes about 1 hour. Especially because they are migrating URLs from the old CMS to the new CMS, they want the search results to point to the new URLs as close to real time as possible.