Can Elasticsearch automatically remove data?
I'm using ES 2.3.
We have 26 nodes in a cluster: three are master nodes and the rest are data nodes.
There are heavy writes happening throughout the day.
On one of the nodes, old-generation GC ran for a very long time, more than 19 hours, and after it completed, applications started to time out. I see that the data on that node gradually dropped from 1.4 TB to 8 GB. I immediately stopped ES on that node.
I have two questions.
Did ES remove the data automatically, treating it as stale?
Can applications point to the master nodes? Right now they point to a few data nodes. If they point to a master node, will requests be routed correctly to whichever node should respond?
It may have reallocated data to nodes not deemed to be having problems, but will not delete data.
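If you want to confirm that shards are moving rather than disappearing, the cluster health API reports how many shards are relocating, initializing, or unassigned. Below is a minimal sketch that interprets an already-parsed `GET _cluster/health` response. The function name and the sample values are hypothetical, made up for illustration; only the field names come from the health API.

```python
def summarize_shard_activity(health):
    """Summarize shard movement from a parsed _cluster/health response (a dict)."""
    relocating = health.get("relocating_shards", 0)
    initializing = health.get("initializing_shards", 0)
    unassigned = health.get("unassigned_shards", 0)

    if unassigned:
        # Some shard copies currently have no node assigned.
        state = "unassigned shards present"
    elif relocating or initializing:
        # Shard copies are being moved or rebuilt on other nodes.
        state = "shards moving or recovering"
    else:
        state = "stable"

    return {
        "status": health.get("status"),
        "state": state,
        "in_flight": relocating + initializing,
        "unassigned": unassigned,
    }


# Hypothetical sample values, not taken from the cluster discussed here.
sample_health = {
    "status": "yellow",
    "number_of_nodes": 25,
    "relocating_shards": 4,
    "initializing_shards": 2,
    "unassigned_shards": 0,
}

print(summarize_shard_activity(sample_health))
```

A non-zero `in_flight` count while one node's disk usage shrinks and others grow would be consistent with reallocation rather than data loss.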
Dedicated master nodes should not serve traffic. They should be left to manage the cluster, so that there is minimal chance of them suffering from long GC pauses.
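To build the list of hosts your applications should point at, you could filter the output of `GET _cat/nodes?h=host,node.role,master` and skip the dedicated masters. This is only a sketch: it assumes that in 2.x the `node.role` column shows `d` for data nodes, `c` for client nodes, and `-` for nodes that are neither (which is how dedicated masters should appear). Please check `GET _cat/nodes?help` on your version before relying on it. The function name and the sample addresses are hypothetical.

```python
def client_facing_hosts(cat_nodes_text):
    """Return hosts that are data or client nodes, skipping dedicated masters.

    Expects whitespace-separated columns: host, node.role, master.
    Assumption: node.role is 'd' (data), 'c' (client), or '-' (neither).
    """
    hosts = []
    for line in cat_nodes_text.strip().splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed or header lines
        host, role, _master = parts
        # A dedicated master holds no data and is not a client node,
        # so under the assumption above its role column is '-'.
        if role in ("d", "c"):
            hosts.append(host)
    return hosts


# Hypothetical sample output: three dedicated masters and two data nodes.
sample_nodes = """\
10.0.0.1 - *
10.0.0.2 - m
10.0.0.3 - m
10.0.0.4 d -
10.0.0.5 d -
"""

print(client_facing_hosts(sample_nodes))
```

Any node you send a request to acts as the coordinating node and forwards it to the shards that hold the data, so pointing at data or client nodes is enough.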
It didn't delete any data from the cluster overall; I mean there is no data discrepancy in the cluster. I also agree that the remaining nodes would have taken over the data on behalf of the node with the GC issue. But do you think that, in due course, it cleans up the problem node and then tries to populate it with new data? The disk usage graph shows the data on that node slowly decreasing.
I immediately stopped ES on that node. It now has 8 GB of data, down from 1.4 TB.
It looks like it removes the data on the stale node and then rebuilds new data on that node. I can see the size growing back toward 1.4 TB.