How to identify and delete invalid child IDs

I have an index making extensive use of Parent/Child relations, and over time some Child docs have been deleted without the references to them being removed from their Parent. This doesn't impact searches, but I do use a count of Children in ordering, so I need to prune these 'zombie' references to non-existent Children from Parents.

I can imagine a brute force approach retrieving each Parent having Children, then getting all Child docs and doing a pythonic set comparison of ids, but is there a more efficient approach?
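For reference, the brute-force set comparison described above might look roughly like this. It is only a sketch: it assumes the parents' children[] arrays and the _ids of all existing Child docs have already been pulled out of the index into plain Python structures, and the function names `find_zombie_ids` and `prune_children` are hypothetical.

```python
def find_zombie_ids(parent_children, existing_child_ids):
    """Return {parent_id: set of referenced child ids that no longer exist}.

    parent_children: dict mapping a parent _id to its children[] list
                     (assumed to be fetched from the index beforehand).
    existing_child_ids: iterable of _ids of Child docs that still exist.
    """
    existing = set(existing_child_ids)
    zombies = {}
    for parent_id, child_ids in parent_children.items():
        missing = set(child_ids) - existing
        if missing:
            zombies[parent_id] = missing
    return zombies


def prune_children(child_ids, zombie_ids):
    """Return children[] with the zombie ids removed, keeping the original order."""
    return [cid for cid in child_ids if cid not in zombie_ids]


# Hypothetical sample data standing in for what the index would return.
parents = {"p1": ["c1", "c2", "c3"], "p2": ["c4"]}
alive = ["c1", "c3", "c4"]

zombies = find_zombie_ids(parents, alive)
cleaned = {pid: prune_children(parents[pid], ids) for pid, ids in zombies.items()}
```

Only the parents that appear in `zombies` would then need an update written back, which keeps the number of writes down even though the read side is still a full pass.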

Typically, parent-child relationships do not keep references to children in the parent as far as I know, so is this something you have added and maintain?

Thanks for the reminder! Built this 5 years ago.

Yes, I have added a children array field, and each time a child is assigned an existing parent, its _id gets added to the parent's children[]. There's also a searchy[] array, and certain tag terms of the child get added to that.
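In plain Python terms, the bookkeeping on assignment is roughly the following. This is a sketch of the idea rather than the actual app code: the parent is modelled as a dict, and `assign_child` is a hypothetical name.

```python
def assign_child(parent, child_id, child_tags):
    """Record a child on its parent document (modelled here as a dict).

    Adds the child's _id to children[] and its tag terms to searchy[],
    skipping anything already present so repeated calls are harmless.
    """
    children = parent.setdefault("children", [])
    if child_id not in children:
        children.append(child_id)

    searchy = parent.setdefault("searchy", [])
    for tag in child_tags:
        if tag not in searchy:
            searchy.append(tag)
    return parent


# Hypothetical sample parent that already has one child.
parent = {"_id": "p1", "children": ["c1"], "searchy": ["red"]}
assign_child(parent, "c2", ["red", "blue"])
```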

This seemed convoluted at the time (and still does), but it was the only way I could think of to meet my requirement: searching for a tag and returning one or more parent/child 'clusters'. That is, all of the parent+children sets that have a given tag in any of their members, whether parent or child. So the search for tags is limited to parents, and returns the parent and its child _ids. It approximates a graph in a way.

Maintenance is proving to be a struggle, because over time documents can get updated or removed by a web app, meaning surgery: removing a parent that has children requires transferring the parent role to one of them. Removing a child requires removing its _id from the parent's children[] field, not to mention maintenance of the searchy[] field.
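To make the 'surgery' concrete, here is a rough sketch of the two cases. It assumes searchy[] holds only tag terms contributed by children and can be rebuilt from a lookup of each child's tags; the names `rebuild_searchy`, `remove_child`, `promote_child` and `tags_by_id` are all hypothetical, and documents are modelled as plain dicts.

```python
def rebuild_searchy(child_ids, tags_by_id):
    """Collect the tag terms of the given children, de-duplicated, in order."""
    searchy = []
    for child_id in child_ids:
        for tag in tags_by_id.get(child_id, []):
            if tag not in searchy:
                searchy.append(tag)
    return searchy


def remove_child(parent, child_id, tags_by_id):
    """Drop child_id from children[] and rebuild searchy[] from what remains."""
    parent["children"] = [c for c in parent.get("children", []) if c != child_id]
    parent["searchy"] = rebuild_searchy(parent["children"], tags_by_id)
    return parent


def promote_child(parent, new_parent_id, tags_by_id):
    """Build the replacement parent when the old parent is being removed.

    The promoted child takes over the remaining children; searchy[] is
    rebuilt because the promoted doc no longer counts as a child.
    """
    remaining = [c for c in parent.get("children", []) if c != new_parent_id]
    return {
        "_id": new_parent_id,
        "children": remaining,
        "searchy": rebuild_searchy(remaining, tags_by_id),
    }


# Hypothetical sample data.
tags_by_id = {"c1": ["red"], "c2": ["blue"], "c3": ["red", "green"]}
parent = {"_id": "p1", "children": ["c1", "c2", "c3"],
          "searchy": ["red", "blue", "green"]}

remove_child(parent, "c2", tags_by_id)
new_parent = promote_child(parent, "c1", tags_by_id)
```

Rebuilding searchy[] from scratch on every removal is heavier than deleting individual terms, but it avoids having to work out whether another child still contributes the same tag.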

Seems I've dug myself a hole - suggestions are welcome!

I cannot see any other easy or efficient way around this.

Thanks. I plan to investigate redesigning this architecture, as the nature of the data is in fact graph-like. That research is in front of me: Graph: Explore Connections in Elasticsearch Data | Elastic
