Elastic crawler metadata content extraction

Hello!

Is there a way to extract HTML metadata fields with the Elastic crawler without setting class="elastic" on them? We have inherited a large flat-file HTML site that we would like to index, and it would take significant time to add this class to every file.

Is this something that can be done using the Web crawler content extraction rules?

Thanks

  • Imran

Yes, this can be done with the Elastic Web Crawler's content extraction rules. See "Web crawler content extraction rules" in the Enterprise Search documentation (8.15). Note that this feature is not available for the App Search Web Crawler.

You can use either XPath or CSS selectors to identify the element you'd like to extract into its own field.
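
If it helps, a selector can be sanity-checked locally before it goes into a rule. The snippet below is only a minimal sketch in Python, not crawler configuration: the sample page, the field names, and the `extract_meta` helper are all made up for illustration. It uses the standard library's limited XPath support to find a plain `<meta>` tag that has no class="elastic" on it. The equivalent full XPath would be something like `//meta[@name="description"]/@content`, and the CSS selector `meta[name="description"]`, but please check the documentation above for exactly how the rule expects attribute values to be selected.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: plain <meta> tags with no class="elastic" on them.
# Written as well-formed XHTML so the standard-library XML parser accepts it;
# real-world HTML is often not well-formed and would need an HTML parser.
SAMPLE_PAGE = """<html>
  <head>
    <title>Example page</title>
    <meta name="description" content="A legacy flat-file page"/>
    <meta name="author" content="Example Author"/>
  </head>
  <body><p>Hello</p></body>
</html>"""


def extract_meta(page, name):
    """Return the content attribute of the first <meta name=...> tag, or None."""
    root = ET.fromstring(page)
    # Same idea as the XPath //meta[@name='...']/@content. ElementTree only
    # supports a subset of XPath, so the attribute is read in Python instead.
    node = root.find(f".//meta[@name='{name}']")
    return node.get("content") if node is not None else None


description = extract_meta(SAMPLE_PAGE, "description")
author = extract_meta(SAMPLE_PAGE, "author")
print(description)
print(author)
```

If the selector finds the tag here, the same expression is a reasonable starting point for the extraction rule, though the crawler's own matching may differ in the details.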

Thanks, Sean. I appreciate the fast response.