We have deployed Elasticsearch 7.6.1 onto a Kubernetes cluster for log monitoring. Logs are collected with Fluent Bit and sent to Elasticsearch. Everything seems to be working fine. But yesterday, one of the admins contacted me about a large number (700,000+ in 10 hours) of exceptions (org.elasticsearch.index.IndexNotFoundException and sun.nio.fs.UnixException) being detected by Dynatrace. Looking into it, these exceptions happen all the time, and I suspect they are triggered with every "chunk" sent from Fluent Bit.
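For context, the relevant Fluent Bit output section looks roughly like this (host and index name are placeholders, not our real values):

```
[OUTPUT]
    Name    es
    Match   kube.*
    Host    elasticsearch-master
    Port    9200
    # Static target index; the ingest pipeline described below rewrites it,
    # so this "dummy" index never actually receives documents.
    Index   fluentbit-dummy
```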
I confirmed that Elasticsearch appears to be working fine and log messages are flowing in as expected, and there are no ERROR (or WARN) messages in the Elasticsearch logs indicating a problem. But I'm being asked to explain why the exceptions are occurring.
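For what it's worth, these are the kinds of checks I ran in Kibana Dev Tools to confirm the cluster and the real indexes are healthy (index pattern is a placeholder):

```
GET _cluster/health
GET _cat/indices/logs-*?v&s=index
```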
The IndexNotFoundException exceptions mention a specific index name that I know is the index Fluent Bit is sending documents to. However, we have an ingest pipeline that intercepts the incoming documents and redirects them to different indexes based on fields within the message. So no documents are ever written to that "dummy" index and it is never created. What confuses me is why Elasticsearch is attempting to verify that it exists, and, given the tight correlation between the Elasticsearch exceptions and the file system exceptions, why it is (apparently) making calls to the file system to check for the index.
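Roughly speaking, the pipeline just rewrites the _index metadata field on each document, along these lines (pipeline name and routing field are made up for illustration):

```
PUT _ingest/pipeline/route-logs
{
  "description": "Redirect incoming log documents based on a field in the message",
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "logs-{{service_name}}"
      }
    }
  ]
}
```

So the index name Fluent Bit targets only ever appears in the bulk request; by the time documents are actually indexed, _index points somewhere else.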
I went ahead and created the "dummy" index this morning, assuming that this would eliminate the exceptions. But, strangely, while the number of Elasticsearch exceptions dropped significantly (by about 75%), they weren't eliminated completely. Thinking about it now, I realize I never added any documents to this dummy index... so maybe the index metadata exists but no index files exist on disk at this point.
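For reference, this is essentially all I did, with default settings and no documents, so the index exists but is empty (index name is a placeholder):

```
PUT /fluentbit-dummy

GET /fluentbit-dummy/_count
```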
In any case, can anyone clarify why these exceptions are being thrown when nothing is ever actually written to this "dummy" index?
Thanks!