How to identify and delete invalid child IDs

kgeographer · December 19, 2022, 2:57am

I have an index making extensive use of Parent/Child relations, and over time some Child docs have been deleted without removing reference to them in their Parent. This doesn't impact searches but I do use a count of Children in ordering, so I need to prune these 'zombie' references to non-existing Children from Parents.

I can imagine a brute force approach retrieving each Parent having Children, then getting all Child docs and doing a pythonic set comparison of ids, but is there a more efficient approach?
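One way to cut the per-parent round trips is to gather the data in two passes and do a single set comparison in memory: collect every parent's children list once, collect the set of child _ids that still exist once, then diff. The sketch below assumes both have already been fetched from the index (for example with a scroll or scan over each document type); the function names `find_zombie_refs` and `pruned_children` are made up for illustration and are not part of any Elasticsearch client.

```python
from typing import Dict, Iterable, List


def find_zombie_refs(
    parent_children: Dict[str, List[str]],
    existing_child_ids: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each parent _id to the child _ids it lists that no longer exist.

    parent_children: parent _id -> contents of its children[] field
                     (assumed to be fetched beforehand).
    existing_child_ids: _ids of all child docs currently in the index
                        (assumed to be fetched beforehand).
    """
    existing = set(existing_child_ids)  # O(1) membership checks
    zombies: Dict[str, List[str]] = {}
    for parent_id, children in parent_children.items():
        missing = [c for c in children if c not in existing]
        if missing:
            zombies[parent_id] = missing
    return zombies


def pruned_children(children: List[str], zombie_ids: Iterable[str]) -> List[str]:
    """Return children[] with the zombie _ids removed, preserving order."""
    dead = set(zombie_ids)
    return [c for c in children if c not in dead]
```

Only the parents that appear in the result need an update, so the write load is proportional to the number of damaged parents rather than to the size of the index.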

Christian_Dahlqvist · December 19, 2022, 5:52am

Typically, parent-child relationships do not store references to children in the parent, as far as I know. Is this something you have added and maintain yourself?

kgeographer · December 19, 2022, 12:26pm

Thanks for the reminder! Built this 5 years ago.

Yes, I have added a children array field, and each time a child is assigned an existing parent, its _id gets added to the parent's children[]. There's also a searchy[] array, and certain tag terms of the child get added to that.
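As a rough illustration of the bookkeeping described here, the following sketch applies an "attach child" step to a parent document held as a plain dict. The field names children and searchy come from the post; the function name `attach_child`, the `tag_terms` argument, and the choice to skip duplicates are assumptions, not the actual application code.

```python
from typing import Any, Dict, Iterable


def attach_child(
    parent: Dict[str, Any],
    child_id: str,
    tag_terms: Iterable[str],
) -> Dict[str, Any]:
    """Return a copy of the parent with the child _id and tag terms added.

    Duplicates are skipped so that re-running the step is harmless
    (an assumption; the original app may behave differently).
    """
    children = list(parent.get("children", []))
    searchy = list(parent.get("searchy", []))

    if child_id not in children:
        children.append(child_id)
    for term in tag_terms:
        if term not in searchy:
            searchy.append(term)

    updated = dict(parent)  # shallow copy; the input dict is left untouched
    updated["children"] = children
    updated["searchy"] = searchy
    return updated
```

Keeping the step idempotent means a retried or repeated request cannot inflate the child count used for ordering.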

This seemed at the time (and now) convoluted, but it was the only way I could think of to meet my requirement: searching for a tag and returning one or more parent/child 'clusters'. That is, all of the parent+children sets that have a given tag in any of their members, whether parent or child. So the search for tags is limited to parents, and returns the parent and its child _ids. It approximates a graph in a way.

Maintenance is proving to be a struggle, because over time documents can get updated or removed by a web app, which means surgery: removing a parent that has children requires transferring the parent role to one of them. Removing a child requires removing its _id from the parent's children[] field... not to mention maintenance of the searchy[] field.
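The two kinds of surgery mentioned here can be sketched as small functions over a parent document held as a plain dict. This is only an illustration under assumptions: the names `remove_child` and `promote_child` are hypothetical, promoting the first listed child is an arbitrary choice, and searchy[] maintenance is left out because the post does not say how its terms are derived.

```python
from typing import Any, Dict, List, Optional, Tuple


def remove_child(parent: Dict[str, Any], child_id: str) -> Dict[str, Any]:
    """Return a copy of the parent without child_id in its children[]."""
    updated = dict(parent)  # shallow copy; the input dict is left untouched
    updated["children"] = [
        c for c in parent.get("children", []) if c != child_id
    ]
    return updated


def promote_child(parent: Dict[str, Any]) -> Tuple[Optional[str], List[str]]:
    """Pick a successor for a parent that is about to be deleted.

    Returns (new_parent_id, remaining_children). The first listed child
    is promoted, which is an arbitrary rule chosen for this sketch.
    Returns (None, []) when the parent has no children.
    """
    children = list(parent.get("children", []))
    if not children:
        return None, []
    return children[0], children[1:]
```

Running `remove_child` in the same code path that deletes the child document is what keeps zombie references from accumulating in the first place.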

Seems I've dug myself a hole - suggestions are welcome!

Christian_Dahlqvist · December 19, 2022, 12:29pm

I cannot see any other easy or efficient way to get around this.

kgeographer · December 19, 2022, 12:36pm

Thanks. I plan to investigate redesigning this architecture, as the nature of the data is in fact graph-like. That research is still ahead of me, starting with "Graph: Explore Connections in Elasticsearch Data | Elastic".

system · January 16, 2023, 12:37pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
