Enrich workplace search indices

Laetitia_RICHARD · March 5, 2021, 8:21am

Hello,

I use Workplace Search to benefit from its connectors, in particular the SharePoint connector. I need to do some post-processing on my documents (cleaning, alignment, enrichment via an external API), so I use Logstash with Workplace Search's indices as input and update those same indices as output. But I noticed that every day, with each full synchronization, I lose all my enrichments: during this synchronization, the documents in Workplace Search do not seem to be updated but created again. Would you have a solution to keep all my changes (other than using a new index as output)?

Sean_Story · March 5, 2021, 3:46pm

Hi @Laetitia_RICHARD ,

I applaud your ingenuity in using Logstash to apply enrichment and post-processing to the documents indexed by Workplace Search. However, as you've observed, full syncs don't just apply "changes"; they refresh the content source from scratch, overwriting existing data.

You have a few potential options.

1. You could sync once, then disable content source syncing with: workplace_search.content_source.sync.enabled: false. This would disable all syncs for all your content sources, though, and you would no longer get any updates. It seems unlikely you'd want that.
2. You could use Logstash to feed into a Custom API Source instead. You could then make the original SharePoint source non-searchable, so that you don't get duplicate results in your UI.
3. You could abandon the out-of-the-box SharePoint connector entirely and use only the Custom API Source approach, with custom-written extraction and post-processing code. While this option is a lot more work, it will probably be more stable over time, since we don't guarantee the stability of Workplace Search's underlying indices: they are subject to changes that may not be backwards compatible with your Logstash pipeline.
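
For the Custom API Source options above, the push side could look roughly like the following Python sketch. It is an illustration under assumptions, not a tested integration: the endpoint path (/api/ws/v1/sources/<content source ID>/documents/bulk_create), the bearer-token header, and the limit of 100 documents per request reflect my understanding of the 7.x Workplace Search API and should be checked against the documentation for your version. The function names, base URL, source ID, and token are all hypothetical placeholders.

```python
import json
from urllib import request

# Assumption: a Custom API Source accepts at most 100 documents per bulk request.
MAX_BATCH = 100


def batches(docs, size=MAX_BATCH):
    """Yield successive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_bulk_request(base_url, source_id, access_token, docs):
    """Build (but do not send) a bulk_create request for one batch of documents."""
    # Assumed endpoint shape for a Workplace Search Custom API Source.
    url = f"{base_url}/api/ws/v1/sources/{source_id}/documents/bulk_create"
    return request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def push_documents(base_url, source_id, access_token, docs):
    """Send every batch in turn and return the parsed JSON responses."""
    responses = []
    for batch in batches(docs):
        req = build_bulk_request(base_url, source_id, access_token, batch)
        with request.urlopen(req) as resp:
            responses.append(json.load(resp))
    return responses
```

If each enriched document carries a stable id field (for example the original SharePoint item ID), pushing the same document again should update it rather than create a duplicate, which is what keeps the enrichment from being lost on re-runs.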

We've talked about adding custom post-processing capabilities, but have not baked that into the solution yet. If you have a support relationship with Elastic, you can file an Enhancement Request for it, to help bump up the priority.

Laetitia_RICHARD · March 5, 2021, 5:18pm

Thank you very much for your answer.
For now, as we are in the POC phase, I will use a new Elasticsearch index for my output, so that my enrichments will be persistent.
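
A minimal sketch of that separate-index approach, assuming the enriched documents are written through the Elasticsearch _bulk API and keyed by the original document ID so that re-running the enrichment overwrites each document instead of duplicating it. The index name, the id field, and the helper name are hypothetical; in a Logstash pipeline the same effect would come from setting the document ID on the output.

```python
import json


def to_bulk_ndjson(index_name, docs, id_field="id"):
    """Serialize documents as an Elasticsearch _bulk body made of `index` actions."""
    lines = []
    for doc in docs:
        # Reusing the source document's ID as _id makes repeated runs idempotent.
        action = {"index": {"_index": index_name, "_id": doc[id_field]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical enriched document and index name, for illustration only.
    enriched = [{"id": "sp-001", "title": "Quarterly report", "language": "fr"}]
    print(to_bulk_ndjson("enriched-sharepoint-docs", enriched), end="")
```

The resulting body would be sent as a POST to the cluster's _bulk endpoint with the Content-Type header set to application/x-ndjson. Because this index is outside Workplace Search, the daily full sync does not touch it.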
