How to delete documents by a certain field?

Hello,

I have an App Search engine and I need to delete documents in it by a certain field value. I receive JSON files that look something like this:

{
  "entries": [
    {
      "id": "park_rocky-mountain",
      "category": "national-parks",
      "title": "Rocky Mountain",
      "nps_link": "https://www.nps.gov/romo/index.htm",
      "states": [
        "Colorado"
      ],
      "visitors": 4517585,
      "world_heritage_site": false,
      "location": "40.4,-105.58",
      "acres": 265795.2,
      "date_established": "1915-01-26T06:00:00Z"
    },
    {
      "id": "park_saguaro",
      "category": "national-parks",
      "title": "Saguaro",
      "nps_link": "https://www.nps.gov/sagu/index.htm",
      "states": [
        "Arizona"
      ],
      "visitors": 820426,
      "world_heritage_site": false,
      "location": "32.25,-110.5",
      "acres": 91715.72,
      "date_established": "1994-10-14T05:00:00Z"
    }
  ]
}

In my case each file may contain between 10 and 20 entries, and the category is different for each JSON file. For example, one file could hold theme parks instead.

Occasionally some entries are removed from a JSON file (but I don't know which ones), so I need to delete all entries of a certain category and then reindex the JSON file into the engine.

I'm using the Python App Search client to index my documents.

What is the recommended way to delete documents by a certain field?

I'd do this in two steps:

  1. query for existing documents that match the category using a value filter.
  2. issue a delete request referencing the IDs from the previous step's results.
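
A rough sketch of those two steps in Python, in case it helps. It assumes the legacy `elastic-app-search` client, whose `search(engine_name, query, options)` and `destroy_documents(engine_name, ids)` methods are used here; with the newer `elastic-enterprise-search` package the method names and arguments differ. The helper name `delete_by_category`, the page size and the batch size of 100 are my own choices, not anything App Search prescribes, and the filter only works if `category` is a text field in your schema.

```python
def delete_by_category(client, engine_name, category, page_size=100, batch_size=100):
    """Delete all documents whose `category` field equals `category`.

    `client` is assumed to behave like the legacy elastic_app_search.Client.
    Returns the list of document IDs that were sent for deletion.
    """
    ids = []
    page = 1
    # Step 1: collect every matching ID first, so that deleting does not
    # shift the result pages while we are still reading them.
    while True:
        options = {
            "filters": {"category": category},  # value filter on the field
            "page": {"current": page, "size": page_size},
        }
        response = client.search(engine_name, "", options)
        ids.extend(hit["id"]["raw"] for hit in response["results"])
        if page >= response["meta"]["page"]["total_pages"]:
            break
        page += 1

    # Step 2: delete the collected IDs in batches.
    for start in range(0, len(ids), batch_size):
        client.destroy_documents(engine_name, ids[start:start + batch_size])
    return ids
```

After that you would reindex the entries from the file, for example with `client.index_documents(engine_name, entries)`. With only 10 to 20 entries per category the loop will normally finish after a single page.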

You could also take an approach where each category goes into its own engine, so that you can delete the whole engine and re-create it from scratch each time. But this might be overkill for your use case.
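
If you did want to try the engine-per-category route, a minimal sketch could look like this. Again it assumes the legacy `elastic-app-search` client (`destroy_engine`, `create_engine`, `index_documents`), and it assumes the engine already exists and is named after the category. `rebuild_engine` and the batch size of 100 are hypothetical choices of mine. Keep in mind that a freshly created engine starts without your schema changes, so as far as I know any non-default field types would need to be set again.

```python
def rebuild_engine(client, engine_name, entries, batch_size=100):
    """Drop an existing engine, re-create it, and index `entries` into it.

    `client` is assumed to behave like the legacy elastic_app_search.Client.
    Returns the number of documents sent for indexing.
    """
    client.destroy_engine(engine_name)  # assumes the engine already exists
    client.create_engine(engine_name)
    # Index in batches to stay under any per-request document limit.
    for start in range(0, len(entries), batch_size):
        client.index_documents(engine_name, entries[start:start + batch_size])
    return len(entries)
```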

Thank you very much for your timely and helpful response!
