E-Commerce Search: How to configure content extraction for Product schema?

I signed up for the Elastic Cloud trial. I don't know much about Elastic, but I've worked with website and enterprise search platforms.

I'm attempting a proof of concept for an e-commerce client who needs help with their onsite search and filtering. They use a homegrown solution right now, and we'd benefit from getting a more robust solution in place.

I've set up an index and configured it to crawl their product XML feed.

Their HTML is a bit messy, so I'd like to use their product schema to pull in product details.

I imagine I'd need to store the full HTML, since the schema is JSON in the HTML head.

Can anyone point me to more details on how to achieve this? Would it be through an ingest pipeline or through content extraction?
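
To make the goal concrete, here is a rough sketch of the extraction step done outside Elastic, using only the Python standard library. It assumes the product schema is embedded as JSON-LD in a `<script type="application/ld+json">` tag, which may not match the client's pages. The names `JsonLdExtractor`, `extract_products`, and `sample_html` are hypothetical and exist only for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_products(html):
    """Return every top-level JSON-LD object whose @type is "Product"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    products = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Product":
                products.append(item)
    return products


# Hypothetical page used only to exercise the sketch.
sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "sku": "W-1", "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body><div>messy markup</div></body></html>"""

products = extract_products(sample_html)
```

This sketch ignores products nested under `@graph` and pages where `@type` is a list, so real pages may need extra handling.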

I've done SEO consulting for many years. Before that, I was a UX designer and built out enterprise search for a leading medical association website. We began with htDig in the very early days, then moved to Google Search Appliance and Omniture Search&Promote. I love this stuff!

Hi Garrett,

Is it possible to ingest their product catalogue by some means other than the crawler? You mentioned that they have a product XML feed. Have you considered building a data pipeline that pulls the XML feed and ingests it into Elasticsearch?

It may also be worth considering a custom connector. You can read more in the Elastic connectors documentation.
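
As a minimal sketch of the pipeline idea, the snippet below parses a product feed and builds a newline-delimited JSON payload in the shape the Elasticsearch `_bulk` API accepts. The feed layout (`<products>`, `<product>`, `<id>`, `<name>`, `<price>`), the index name `products`, and the helper names are all assumptions for illustration, since the real feed's structure isn't known here.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical feed; the client's real XML will have different element names.
SAMPLE_FEED = """<products>
  <product><id>W-1</id><name>Example Widget</name><price>19.99</price></product>
  <product><id>W-2</id><name>Example Gadget</name><price>5.00</price></product>
</products>"""


def feed_to_docs(xml_text):
    """Turn each <product> element into a flat dict ready for indexing."""
    root = ET.fromstring(xml_text)
    docs = []
    for product in root.iter("product"):
        price_text = product.findtext("price")
        docs.append({
            "id": product.findtext("id"),
            "name": product.findtext("name"),
            "price": float(price_text) if price_text else None,
        })
    return docs


def to_bulk_ndjson(docs, index_name):
    """Build a bulk body: one action line, then one source line, per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = feed_to_docs(SAMPLE_FEED)
payload = to_bulk_ndjson(docs, "products")
```

The resulting `payload` would be sent as the body of a `POST` to the cluster's `_bulk` endpoint with `Content-Type: application/x-ndjson`. Authentication and the endpoint URL depend on the deployment and are left out here.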

Hi Joeseph,

The XML feed is a good idea. The search engines I've built have always been based on a crawl because I'm not a back-end developer, and they were content from a multitude of platforms with varying levels of connectivity. I'll dig into the Elastic connectors documentation. I appreciate your reply!
