I finally got around to documenting Apache Tika's MockParser [1]. As of Tika 1.15 (not yet released), you can add tika-core-tests.jar to your classpath and simulate:
- Regular catchable exceptions
- OOMs
- Permanent hangs
This lets you test whether your ingest framework is robust against these failure modes.
As always, we fix Tika when we can, but if history is any indication, you'll want to make sure your ingest code can handle these problems if you are processing millions or billions of files from the wild.
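To make "robust" a bit more concrete, here is a minimal sketch of the kind of guard an ingest framework might put around each parse call. It does not use any Tika classes: the class GuardedIngest, the method runGuarded, and the Outcome enum are hypothetical names invented for this illustration, and the failing tasks in main are plain lambdas standing in for whatever MockParser would trigger. The OOM is simulated by throwing OutOfMemoryError directly rather than by exhausting the heap.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from Tika.
class GuardedIngest {

    public enum Outcome { OK, EXCEPTION, OOM, TIMEOUT, ABORTED }

    // Runs one parse task on its own daemon thread and classifies how it ended.
    public static Outcome runGuarded(Callable<String> parseTask, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-parse");
            t.setDaemon(true); // a hung worker must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(parseTask);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // Interrupts the worker; a task that ignores interrupts keeps running.
            future.cancel(true);
            return Outcome.TIMEOUT;
        } catch (ExecutionException e) {
            // The executor wraps anything the task threw, Errors included.
            return (e.getCause() instanceof OutOfMemoryError) ? Outcome.OOM : Outcome.EXCEPTION;
        } catch (InterruptedException e) {
            // The calling thread itself was interrupted while waiting.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Outcome.ABORTED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the three failure modes, plus a normal parse.
        System.out.println(runGuarded(() -> "extracted text", 2000));
        System.out.println(runGuarded(() -> { throw new IOException("simulated"); }, 2000));
        System.out.println(runGuarded(() -> { throw new OutOfMemoryError("simulated"); }, 2000));
        System.out.println(runGuarded(() -> { Thread.sleep(Long.MAX_VALUE); return "never"; }, 200));
    }
}
```

An in-process guard like this only goes so far. A Java thread cannot be forcibly killed, so a parse that hangs without responding to interrupts will keep its thread, and after a real OOM the JVM may not be in a trustworthy state. For large crawls, running the parser in a separate process that can be killed and restarted is likely the safer design.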
Cheers,
Tim