Web crawler not extracting custom fields

I'm following the instructions at Web crawler reference | Elastic App Search Documentation [8.4] | Elastic to get some custom fields crawled on our site, but they are not being picked up. I initially deployed with data-swiftype-name attributes, but recently changed to data-elastic-name and re-indexed, and it made no difference. According to the data in Kibana, pages on the site with the tags are definitely being crawled, but the documents produced don't have any custom fields. Is there any way to get more information about why the fields are being missed?

Additionally, I must admit that I'm a little confused by the versions of Elasticsearch, as I wasn't involved in the setup. I log on to a UI with an elastic-cloud URL, and inside it says App Search v7.12.0. From there I set up an engine and configured the web crawler.

Thanks.

Hi Richard,

Welcome to the Elastic community! The issue you're running into is related to the version discrepancy you mentioned. Custom field meta tag extraction is a newly released feature in 7.13, and your deployment will need to be upgraded to that version to utilize it.

Wow, thanks for the quick response. We are starting an upgrade now. I've also just noticed that the version number is right there in the documentation link. I should have spotted that, sorry :blush:

Hi again, our crawler is now indexing documents with custom fields nicely, except for one, and I can't work out why. Here's the HTML from a typical page:

<p data-elastic-name="type">recipe</p>
<p data-elastic-name="cookingTime">30 minutes</p>
<p data-elastic-name="category">Main Course_</p>
<p data-elastic-name="category">Lunch Recipes</p>

It's cookingTime that never seems to get picked up. The type field above it works fine, and category beneath it nicely creates an array field.
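One way to rule out a markup problem is to check which data-elastic-name values are actually present in the HTML the server returns. The following is a minimal sketch using only the Python standard library; the class and function names (ElasticNameCollector, extract_fields) are hypothetical, and it only inspects the markup. It does not reproduce how the App Search crawler itself extracts or names fields, so a field showing up here does not guarantee that the crawler indexes it.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElasticNameCollector(HTMLParser):
    """Collects the text of elements that carry a data-elastic-name attribute.

    Hypothetical helper for inspecting markup; void elements (such as meta)
    are ignored, even if they carry the attribute.
    """

    def __init__(self):
        super().__init__()
        self.fields = defaultdict(list)
        self._current = None  # field name of the element being read
        self._depth = 0       # nesting depth inside that element
        self._chunks = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        name = dict(attrs).get("data-elastic-name")
        if name:
            self._current = name
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            text = "".join(self._chunks).strip()
            self.fields[self._current].append(text)
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_fields(html):
    """Return {field name: [values]} for every data-elastic-name element."""
    parser = ElasticNameCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.fields)


sample = """
<p data-elastic-name="type">recipe</p>
<p data-elastic-name="cookingTime">30 minutes</p>
<p data-elastic-name="category">Main Course_</p>
<p data-elastic-name="category">Lunch Recipes</p>
"""

for field, values in extract_fields(sample).items():
    print(field, values)
```

Running this against the raw page source as fetched without JavaScript, rather than against a pasted snippet, would show whether cookingTime is present in the markup the crawler receives. If it is, the markup is probably not what is dropping the field.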

I have access to our crawler logs via Kibana, but I haven't looked deeply into them beyond confirming that the crawl is working. If
