Hello, I am currently using the App Search web crawler and would like to add custom fields to the crawled documents.
Unfortunately, the websites that the web crawler is crawling do not allow me to add meta tags for custom fields. As a workaround, I have set up a proxy server in front of a website domain, so that the crawler fetches every page under that domain through the proxy and the proxy adds the custom fields. I followed the instructions in "Extract custom fields using web crawler and proxy" (App Search documentation [8.9], Elastic).
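For context, here is a minimal sketch of the kind of injection step such a proxy performs. It assumes the custom fields are exposed as `<meta class="elastic" ...>` tags in the page head; the function name `inject_custom_fields` and the example field `department` are hypothetical and only for illustration, not my exact setup.

```python
import html
import re

# Matches the closing head tag regardless of case or stray whitespace.
_HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def inject_custom_fields(page_html: str, fields: dict[str, str]) -> str:
    """Insert one <meta class="elastic"> tag per custom field before </head>.

    Hypothetical helper: the proxy would call this on each HTML response
    fetched from the upstream site before returning it to the crawler.
    """
    tags = "".join(
        '<meta class="elastic" name="{}" content="{}">\n'.format(
            html.escape(name, quote=True), html.escape(value, quote=True)
        )
        for name, value in fields.items()
    )
    match = _HEAD_CLOSE.search(page_html)
    if match is None:
        # No head section found; return the page unchanged.
        return page_html
    return page_html[: match.start()] + tags + page_html[match.start():]


if __name__ == "__main__":
    page = "<html><head><title>t</title></head><body></body></html>"
    print(inject_custom_fields(page, {"department": "sales"}))
```

The helper only rewrites the response body, so the page is still requested through the proxy's address.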
However, the URL of each document that the crawler generates points to the proxy server instead of the actual domain.
Additionally, I have about 5 domains, which means I would need to run 5 proxy servers to add the custom fields. Managing all of these proxy servers has become cumbersome.
Could you kindly provide me with some advice on this matter?
I have also considered using the Elastic Web Crawler, but it does not have an API, which I need for crawling websites.
Thanks.