Custom API Source, Delete All Documents

TonyWRobinson · September 16, 2021, 7:21pm

The API documented here: Custom sources indexing API reference | Workplace Search Guide [8.4] | Elastic

...seems pretty straightforward.

But I've opened a case with support because the response I get when attempting to use it (with the access token associated with my Custom API Source in Workplace Search) is:

"error": "Routing Error. The path you have requested is invalid."

I'm curious if anyone else has used the various API's for creating (bulk_create) or updating or removing documents in Custom API Sources?

We use these Custom API Sources extensively. We typically have a logstash pipeline doing the work to pull the data from our custom source and index it in workplace search. In this case, the pipeline uses a jdbc input. And the query for the jdbc input has changed -- not the schema but the number of results it produces. So what I want to do is "purge" the existing documents in the Custom API Source indexes and re-fill them with the proper subset of documents. This way I don't have to completely remove the Custom API source and re-add it, having to, in turn, re-do the Display Settings.

Anyone else ever tried to do such a purge without entirely removing and re-adding the Custom API source?

Sean_Story · September 16, 2021, 9:09pm

Hey @TonyWRobinson !

Good questions. I just double checked on 7.14.1, and did not see the issue you mention:

Seans-MBP-2:ent-search seanstory$ TOKEN=REDACTED
Seans-MBP-2:ent-search seanstory$ ID=6143ae8ec19fb6dbccc280f8
Seans-MBP-2:ent-search seanstory$  curl -X DELETE http://localhost:3002/api/ws/v1/sources/${ID}/documents \
> -H "Authorization: Bearer ${TOKEN}" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    39    0    39    0     0    113      0 --:--:-- --:--:-- --:--:--   113
{
  "total": 59,
  "deleted": 59,
  "failures": []
}

Given your error message of "error": "Routing Error. The path you have requested is invalid.", you may have typo'd your endpoint or content source ID.

Can you share:

your cURL command that demonstrates the issue?
What version of Enterprise Search you're running?
The result of running the Get a content source by ID API for that same content source?

what I want to do is "purge" the existing documents in the Custom API Source indexes and re-fill them with the proper subset of documents. This way I don't have to completely remove the Custom API source and re-add it, having to, in turn, re-do the Display Settings.

I'm glad to hear that this is what you're using the delete-all-documents API for! This was exactly the usecase we developed it for. It definitely should work as you're expecting it to, so I'd love to help you track down the issue.

This way I don't have to completely remove the Custom API source and re-add it, having to, in turn, re-do the Display Settings.

Actually, you might be interested to note our Create a Content Source API, which allows you to specify your schema and display settings at creation time, and/or our Update a Content Source API, which allows you to modify them after creation time.

TonyWRobinson · September 17, 2021, 2:12pm

Hi, @Sean_Story ! Thank you for your reply.

Here is the cURL command I'm using (against version 7.11.1 of the API):

curl -X DELETE "http://servername.qa.internal.net:3002/api/ws/v1/sources/24CharsICopiedFromWPSUIForThisSource/documents" -H "Authorization: Bearer evenMoreCharactersICopiedFromTheWPSUIForThisSource"
{"error":"Routing Error. The path you have requested is invalid."

And, interestingly, the GET fails in the exact same way:

curl -X GET "http://servername.qa.internal.net:3002/api/ws/v1/sources/24CharsICopiedFromWPSUIForThisSource" -H "Authorization: Bearer evenMoreCharactersICopiedFromTheWPSUIForThisSource"
{"error":"Routing Error. The path you have requested is invalid."

I'm skeptical that I've "fat-fingered" the source ID or the bearer token because these are simply copy and past operations... I've redacted them here, obviously, with fake strings, but I've double checked that clicking the button in the UI to do the copy did not miss any characters.

That said, the one thing that might be a little weird about our setup is that we typically hit the API through an F5 that is secured with ssl (while the server behind the F5 where Enterprise Search is hosted, is NOT secured with ssl)... don't know if that would make any difference, but I thought I'd mention it.

If I make the call hitting the https endpoint instead, I get the same failure message, so it makes me think it's something with WPS that's wigging out.

Look forward to your reply!

Sean_Story · September 17, 2021, 6:55pm

Hi @TonyWRobinson,

Here is the cURL command I'm using (against version 7.11.1 of the API):

Ah! I think this is your issue. The delete-all-documents API was not present in 7.11.x. Be sure to always check the documentation version that matches the version of Enterprise Search that you are using. See: Custom sources indexing API reference | Workplace Search Guide [7.11] | Elastic does not include the delete-all-documents API. If you do not have a record of all your document IDs (to use the delete-by-id API), you may want to upgrade to at least version 7.13 in order to access the delete-all-documents API. See the 7.13.0 Release Notes.

You may also be interested to know that, coming soon in 7.15.0, we hope to release a Delete Documents by Query API, which will allow you to use a process like:

index all documents from your source
documents become stale
make note of the current time. Re-index all documents from your source (updating/overwriting documents that existed before by using consistent IDs)
use the delete-documents-by-query API to remove documents with an updated_at earlier than the time noted in the previous step.

This will ensure that you have next to no "down time" while you refresh your content source. While this feature isn't available quite yet, I figured I'd make you aware of it if you'll be considering an upgrade anyway.

I hope this all helps!

TonyWRobinson · September 17, 2021, 7:38pm

Thanks, @Sean_Story ! It does. I'm ready to upgrade... this is functionality we need. May end up being a two-step process -- go to 7.14 now and .15 when it becomes available.

Good to know the error message was accurate!

TWR

system · October 31, 2022, 2:48am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
WorkPlace Search document bulk create url Elastic Search elastic-workplace-search	5	472	November 9, 2021
How to retrieve all document form a content source of Workplace search Elastic Search elastic-workplace-search	12	2851	November 16, 2021
Workplace Search Reference Custom Index Elastic Search elastic-workplace-search	14	305	February 15, 2024
Create custom source elasticWorkplace Elastic Search elastic-workplace-search	40	2734	October 31, 2022
Connecting Outlook with CUSTOM API SOURCES Elastic Search elastic-workplace-search	4	849	October 31, 2022

Custom API Source, Delete All Documents

Related topics