I have a bit of dirty data in my index.
While the 'id' field of all regular documents should be a domain name
(example.com, blog.example.com, ...), there was a bit of bad code which
inserted autogenerated values ('rM8CDN-aTC6lxaPIc858Rg', ...) instead.
Now I'd like to write a little cleanup script that deletes these bad
documents.
My problem is that I have about 100 million documents, so just iterating
over all of them would take ages.
Something that I'd love to be able to do: Filter for id fields that don't
have a dot in them. Would that need a wildcard query with a trailing and
leading wildcard?
Alternatively I could probably also filter for id fields that are 22
characters long.
My problem is that I haven't figured out how to do either of those two
things.
Any recommendations on how to clean this up?
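
To make the two conditions concrete, here is a rough Python sketch of the
checks I have in mind. It only expresses the predicate on a single id value;
it is not an index-side query, so it would still have to be translated into
whatever filter the index supports. The function names are made up for
illustration, and the character set of the autogenerated ids (URL-safe
base64) is an assumption based on the one sample value above.

```python
import re

# Assumption: autogenerated ids are exactly 22 characters drawn from the
# URL-safe base64 alphabet, like the sample 'rM8CDN-aTC6lxaPIc858Rg'.
AUTOGENERATED_ID = re.compile(r"[A-Za-z0-9_-]{22}")


def has_no_dot(doc_id: str) -> bool:
    """Condition 1: a regular id is a domain name, so it contains a dot."""
    return "." not in doc_id


def looks_autogenerated(doc_id: str) -> bool:
    """Condition 2: the whole id is 22 characters of the assumed alphabet."""
    return AUTOGENERATED_ID.fullmatch(doc_id) is not None


def is_bad_id(doc_id: str) -> bool:
    """Hypothetical combined check: both conditions must hold."""
    return has_no_dot(doc_id) and looks_autogenerated(doc_id)
```

Requiring both conditions is the conservative choice: an id without a dot
that is not 22 characters long would not be flagged for deletion.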