Web Crawler API

Hi,

Reposting this issue because it has not been answered yet.

I have a couple of questions regarding the web crawler APIs (App Search's crawler and the Elastic Web Crawler):

  1. Why does the more powerful web crawler (Elastic Web Crawler) not have an API while the App Search crawler does have an API?
  2. Is there any way to configure the crawler without using the GUI? (API calls)
  3. Will there come an API for the Web Crawler in the future?

We just find it strange that this crawler does not have an API. Is there a reason why it does not have one?

We were very confused to read this comparison.

Thanks in advance!
Kind regards, Chenko

I see Elastic has released the Open Web Crawler; however, it is not quite clear whether this one has an API or not. I would also like to know if an API is coming in the future for both crawlers.

Hi @Chenko ,

Sorry we missed your first post and that this one has gone unanswered for so long. We get a lot of discuss forum posts, and route notifications based on "category" and "tag." Looks like your question didn't have quite the right set of metadata to get in front of the team with the answers. :slight_smile:

  1. Why does the more powerful web crawler (Elastic Web Crawler) not have an API while the App Search crawler does have an API?

Good question. The answer won't be satisfying, I'm afraid. We initially made the Elastic Crawler to be able to serve the needs of customers who wanted to use Elasticsearch indices, not App Search Engines. We mostly did a lift-and-shift of the App Search Crawler to accomplish this. However, the APIs didn't make as much sense, as they'd been designed with App Search in mind. We'd meant to come back eventually and build them a more intentional set of APIs, and until then, we didn't want to release them publicly, since we didn't intend for them to live forever.

Of course, the best way to have a feature last forever is to assume it won't. xkcd: Code Lifespan

While we're now getting back to prioritizing the need for programmatic access for our crawler, we're also trying to marry that with some other needed enhancements for our crawler (open code, decoupling from the Enterprise Search application server), so you're more likely to see APIs surface for the Open Crawler at this point than you are for the Elastic Crawler.

  2. Is there any way to configure the crawler without using the GUI? (API calls)

The only supported way to configure the Elastic Crawler today is through Kibana, I'm afraid.

  3. Will there come an API for the Web Crawler in the future?

It's on our vision board for the Open Crawler, but we don't have a committed timeline. In the meantime, though, it can at least be controlled programmatically through configuration files and a CLI.
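To give a rough idea of what "files and a CLI" can look like in practice, here is a minimal Python sketch that writes a crawl config file and then hands it to a crawler executable. Everything specific in it is an assumption for illustration: the YAML key names (`domains`, `seed_urls`, `output_sink`, `output_index`), the `crawl` subcommand, and the `bin/crawler` path are not confirmed anywhere in this thread, so please check the Open Crawler repository's documentation for the actual config schema and commands before relying on it.

```python
import subprocess
import tempfile
from pathlib import Path


def render_config(domain: str, seed_urls: list[str], output_index: str) -> str:
    """Render a small YAML crawl config as text.

    The key names below are illustrative assumptions, not a confirmed schema.
    """
    lines = [
        "domains:",
        f"  - url: {domain}",
        "    seed_urls:",
    ]
    lines += [f"      - {url}" for url in seed_urls]
    lines += [
        "output_sink: elasticsearch",
        f"output_index: {output_index}",
    ]
    return "\n".join(lines) + "\n"


def build_command(crawler_bin: str, config_path: Path) -> list[str]:
    """Build the CLI invocation; the `crawl` subcommand is an assumption."""
    return [crawler_bin, "crawl", str(config_path)]


def run_crawl(crawler_bin: str, domain: str, seed_urls: list[str], output_index: str) -> int:
    """Write the config to a temporary file, run the crawler, return its exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "crawl.yml"
        config_path.write_text(
            render_config(domain, seed_urls, output_index), encoding="utf-8"
        )
        result = subprocess.run(build_command(crawler_bin, config_path), check=False)
        return result.returncode


if __name__ == "__main__":
    # Hypothetical binary path and index name; adjust to your own setup.
    exit_code = run_crawl(
        "bin/crawler", "https://example.com", ["https://example.com/docs"], "my-index"
    )
    print(f"crawler exited with code {exit_code}")
```

Because the config is generated from plain function arguments, the same script could be driven from a scheduler or CI job, which covers much of what people usually want from a crawler API.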
