We have a dataset of 110 million documents, 20 million of which have a reading of "0". Totals are running 10% to 30% lower than expected, so we suspect some hourly readings have been lost and, unfortunately, recorded as "0" instead of null.
One thought I had was to find every "0" reading and check whether the readings immediately before and after it are both non-zero. If so, replace the "0" with the average of the two. I'm OK with either running an in-place replacement or outputting to another index.
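To make the idea concrete, here is roughly what I have in mind as a plain-Python sketch. It is not tied to any particular datastore. The function name `fill_isolated_zeros` is made up for illustration, and it assumes the readings for a single source have already been pulled out and sorted by hour. It only touches a "0" that sits between two non-zero readings, so runs of consecutive zeros and zeros at either end are left alone.

```python
from typing import List


def fill_isolated_zeros(readings: List[float]) -> List[float]:
    """Return a copy where each isolated 0 is replaced by the mean of its neighbours.

    Assumes `readings` is one time-ordered series (one value per hour).
    """
    filled = list(readings)  # copy, so the original series is untouched
    # Skip the first and last positions: they have only one neighbour.
    for i in range(1, len(readings) - 1):
        prev_val, cur, next_val = readings[i - 1], readings[i], readings[i + 1]
        if cur == 0 and prev_val != 0 and next_val != 0:
            filled[i] = (prev_val + next_val) / 2
    return filled


hourly = [4.0, 0, 6.0, 0, 0, 5.0, 0]
print(fill_isolated_zeros(hourly))  # [4.0, 5.0, 6.0, 0, 0, 5.0, 0]
```

One thing this makes obvious is that a genuine "0" reading between two non-zero hours would also get overwritten, so I'd probably want to flag replaced values rather than silently change them.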
Has anyone done something similar? If so, any tips or suggestions on how to accomplish this would be appreciated.