Our malware scanner flagged malware (Exploit:O97M/DDEDownloader.C) in a file on the file system where we store our indices:
file:D:\ESData\Data\nodes\0\indices\w9l0szt8RR-ZeQ8p4asJqA\3\index_108v_Lucene70_0.dvd
I have identified the index, but it contains 15 million documents.
I only store Windows event logs in Elasticsearch.
I would like to investigate this further. Any tips?
Most virus/malware detection works on some kind of heuristic or signature. That is, the scanner sees a particular string of bytes in a file that matches, or hashes to, a known virus and flags the file as that virus, even though it doesn't know for certain that a virus is there. The problem is that systems like Elasticsearch generate strings of bytes that occasionally, by sheer probability (a low probability, but across a very high volume of data), also match an entry in the malware database. That doesn't mean the file contains a virus. It means it contains a set of bytes that has the same hash as, or looks similar to, whatever the malware detector uses to identify that malware. Over the years I have seen many search index and database files incorrectly identified as containing a virus. I've also seen many unfortunate cases where the malware scanner decided to delete the file, thus deleting an index.
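To make the false-positive mechanism concrete, here is a deliberately simplified sketch of signature matching. The signature name and bytes are made up for illustration, and real scanners are far more elaborate than a substring check.

```python
# Toy illustration of signature matching; not how any real scanner is implemented.
# The signature below is invented for this example, not a real malware signature.
SIGNATURES = {
    "Example.Fake.Signature": bytes.fromhex("deadbeef"),
}


def scan(data: bytes) -> list[str]:
    """Return the names of all signatures whose bytes occur anywhere in data."""
    return [name for name, sig in SIGNATURES.items() if sig in data]


# Harmless binary data that happens to contain the same four bytes.
index_like_blob = b"\x00\x01" + bytes.fromhex("deadbeef") + b"\x02\x03"

print(scan(index_like_blob))  # ['Example.Fake.Signature']
print(scan(b"plain text"))    # []
```

The blob is flagged purely because of which bytes it contains, which is the same reason a large index file can trip a scanner without holding anything executable.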
As a result, it's fairly common to add the node data directories to the exclusion list of virus scanners. I'm not particularly familiar with this virus, but if you want to be extra careful, you could check whether it's known to live somehow in Lucene doc-values files (I'm not aware of any viruses that do) and whether the machine in question has adequate controls against however that virus propagates. You may also consider locking down the data directories so that only Elasticsearch can read from and write to them, for some added protection, if you haven't done so already.
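If you want to look inside the flagged file before excluding it, one option is to search a copy of it for a byte pattern you suspect the scanner reacted to. The sketch below is only a starting point: the helper name `find_pattern_offsets` is hypothetical, the pattern `DDEAUTO` is a guess (the scanner's actual signature is not known here), and doc-values data may be encoded so that the original text does not appear verbatim.

```python
def find_pattern_offsets(path, pattern: bytes, chunk_size: int = 1 << 20):
    """Yield the byte offset of every occurrence of pattern in the file at path.

    The file is read in chunks so that large index files fit in memory.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1  # bytes carried over so boundary matches are found
    tail = b""
    base = 0  # file offset of the first byte of the current buffer
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            start = 0
            while (i := buf.find(pattern, start)) != -1:
                yield base + i
                start = i + 1
            # Keep the last `overlap` bytes; they are too short to hold a full
            # match on their own, so no offset is reported twice.
            tail = buf[max(0, len(buf) - overlap):]
            base += len(buf) - len(tail)


# Example usage (pattern is an assumption; run against a copy of the file):
# for offset in find_pattern_offsets("index_108v_Lucene70_0.dvd", b"DDEAUTO"):
#     print(offset)
```

Any hits would suggest the match comes from text in the logged events themselves rather than from anything that can execute out of a Lucene file, but no hits does not prove the opposite.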
Hello, thanks for the exhaustive answer. I thought so too, but felt uncertain. I will exclude the directories. Is there an official list of file extensions that I can use?