Elastic web crawler does not parse the complete HTML of a page; other search engines crawl it fine, but Elastic misses or ignores many sections

We are trying to crawl a site using the Elastic Enterprise Search App Search web crawler. The problem is that the full content of the page is not getting indexed. On most pages, indexing stops after a certain point and the rest of the HTML is ignored. On some pages, it crawls one section, skips some, and then crawls the remaining sections of the page. This is happening for the body content. We do not know whether this is due to malformed HTML; how can we identify that? Is there any option or config to fix this?

In contrast, the existing search solution, which uses Adobe S&P, is able to index and search all of the content.

We are using version 7.16.1.

We are facing this issue on many pages of this site.

We need urgent help.

Hello, Sanjeev

Sorry you are experiencing issues with your web crawler. Can you please share links to the pages that fail to index? It’d help us troubleshoot content extraction for you.

Additionally, you can use the content extraction API to see what the crawler sees on a specific page: Web crawler API reference | Elastic App Search Documentation [8.0] | Elastic
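
On the question of how to tell whether malformed HTML is involved, a rough local check can narrow things down before looking at the crawler itself. The following is only a minimal sketch using Python's standard-library html.parser. The names TagBalanceChecker and find_tag_problems are made up for this illustration, it is not how the Elastic crawler parses pages, and it will also report tags such as <p> or <li> whose closing tag HTML allows you to omit, so treat its output as hints rather than proof.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records open/close tag mismatches."""

    def __init__(self):
        super().__init__()
        self.stack = []      # (tag, (line, column)) for each open tag
        self.problems = []   # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, self.getpos()))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in [open_tag for open_tag, _ in self.stack]:
            self.problems.append(f"line {self.getpos()[0]}: stray </{tag}>")
            return
        # Pop until the matching open tag; everything popped on the way
        # was left unclosed.
        while self.stack:
            open_tag, pos = self.stack.pop()
            if open_tag == tag:
                break
            self.problems.append(
                f"line {pos[0]}: <{open_tag}> not closed before </{tag}>")


def find_tag_problems(html_text):
    """Return a list of suspected tag-balance problems in html_text."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    for tag, pos in checker.stack:
        checker.problems.append(f"line {pos[0]}: <{tag}> never closed")
    return checker.problems


if __name__ == "__main__":
    # Illustrative input only; in practice, pass the saved source of a
    # page that the crawler indexes incompletely.
    sample = "<div><span>body text</div>"
    for problem in find_tag_problems(sample):
        print(problem)
```

If the reported line numbers cluster around the point where indexed content stops, that suggests the markup is worth fixing or validating further; if the page comes back clean, the cause is more likely elsewhere, and the output of the content extraction API mentioned above would be the next thing to compare.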