Can Windows Defrag cause a node to become unresponsive?

I have been trying to track down an issue where nodes momentarily become unresponsive, causing the cluster to think they have left. I assumed this was a GC issue and have been trying to optimize to that end. I added a bunch of telemetry, and while investigating I noticed something. Five nodes had network blips overnight. During this time I did not see any excessive heap use or GC. Looking through the Event Viewer on the nodes, I happened to notice the same error on all five. It was:

Log Name: Application
Source: Microsoft-Windows-Defrag
Date: 2/17/2017 4:27:41 AM
Event ID: 257
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: ocv-es-17
Description:
The volume (C:) was not optimized because an error was encountered: Neither Slab Consolidation nor Slab Analysis will run if slabs are less than 8 MB. (0x8900002D)

It would seem to be unrelated, but on all five nodes this error preceded the node dropping off by roughly 2 to 7 minutes:

Node	Down	Up	Outage	Defrag error	Lead time
es-17	4:32:36	4:33:40	0:01:04	4:27:00	0:05:36
es-12	4:42:04	4:47:16	0:05:12	4:35:00	0:07:04
es-11	4:53:21	4:55:51	0:02:30	4:51:00	0:02:21
es-13	5:11:16	5:16:21	0:05:05	5:08:00	0:03:16
es-07	1:45:56	1:46:42	0:00:46	1:42:35	0:03:21
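
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the outage duration and lead time columns from the raw timestamps. The names `ROWS`, `parse`, and `summarize` are just illustrative, and it assumes every time in a row falls on the same day (nothing crosses midnight).

```python
from datetime import datetime, timedelta

# Rows copied from the table: (node, down, up, time of Defrag error).
ROWS = [
    ("es-17", "4:32:36", "4:33:40", "4:27:00"),
    ("es-12", "4:42:04", "4:47:16", "4:35:00"),
    ("es-11", "4:53:21", "4:55:51", "4:51:00"),
    ("es-13", "5:11:16", "5:16:21", "5:08:00"),
    ("es-07", "1:45:56", "1:46:42", "1:42:35"),
]


def parse(t):
    """Parse an H:MM:SS clock time (date part is ignored)."""
    return datetime.strptime(t, "%H:%M:%S")


def summarize(rows):
    """Return {node: (outage duration, lead time)} as timedeltas."""
    result = {}
    for node, down, up, defrag in rows:
        outage = parse(up) - parse(down)    # how long the node was gone
        lead = parse(down) - parse(defrag)  # Defrag error to node drop
        result[node] = (outage, lead)
    return result


if __name__ == "__main__":
    summary = summarize(ROWS)
    for node, (outage, lead) in summary.items():
        print(f"{node}\toutage={outage}\tlead={lead}")
    leads = [lead for _, lead in summary.values()]
    print("lead time range:", min(leads), "to", max(leads))
```

This only confirms that the Defrag error came first on every node. It says nothing about whether the error caused the drop.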

I don't see any other instances of the Defrag error this week.

Is there any way that Defrag could be causing problems with Elasticsearch?

Thanks,
~john
