Hot-warm 'move index' does not delete index data from original node


(Currycoder) #1

Hello,

I'm experimenting with introducing a 'warm' data node as explained here: https://www.elastic.co/blog/hot-warm-architecture

I've allocated my indices to my 'hot' node and have attempted to move a single index to my 'warm' node, which appears to have been successful. I moved an index called analytics-pageview-2016-08-30. Here's the pertinent output from /_cat/shards:

analytics-pageview-2016-09-20 1 p STARTED     967053 297.9mb 10.196.3.254 Ocean        
analytics-pageview-2016-09-20 1 r UNASSIGNED                                           
analytics-pageview-2016-09-20 0 p STARTED     966107 298.1mb 10.196.3.254 Ocean        
analytics-pageview-2016-09-20 0 r UNASSIGNED                                           
analytics-pageview-2016-08-30 1 p STARTED     841559 256.9mb 10.1.0.61    Lady Octopus 
analytics-pageview-2016-08-30 1 r UNASSIGNED                                           
analytics-pageview-2016-08-30 0 p STARTED     842086 255.9mb 10.1.0.61    Lady Octopus 
analytics-pageview-2016-08-30 0 r UNASSIGNED                                           
analytics-pageview-2016-08-31 1 p STARTED     920134 279.2mb 10.196.3.254 Ocean        
analytics-pageview-2016-08-31 1 r UNASSIGNED                                           
analytics-pageview-2016-08-31 0 p STARTED     919752 284.1mb 10.196.3.254 Ocean        
analytics-pageview-2016-08-31 0 r UNASSIGNED                                  

However, on my 'hot' node, I can still see that index's data:

$ du -sh /var/lib/elasticsearch/elasticsearch/nodes/0/indices/analytics-pageview-2016-08-30
514M	/var/lib/elasticsearch/elasticsearch/nodes/0/indices/analytics-pageview-2016-08-30

Will Elasticsearch eventually clean this up once some criteria are satisfied?

I'd greatly appreciate any advice you could offer.


(Yannick Welsch) #2

Yes, the criterion is that enough shard copies are allocated in the cluster. You specified that shards of this index must have 1 replica, but in your current cluster only the primary is allocated. As long as the replica remains unassigned, the cluster will keep the extra copy of the data around on the original node.
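
As a rough illustration of how to spot this condition, the sketch below scans `_cat/shards` text for replica rows in the UNASSIGNED state. The sample rows are copied from the output above; the function name `unassigned_replicas` and the assumption that the columns arrive in the order index, shard, prirep, state are mine and only hold for the default column layout.

```python
# Sample rows copied from the _cat/shards output posted in this thread.
CAT_SHARDS = """\
analytics-pageview-2016-08-30 1 p STARTED     841559 256.9mb 10.1.0.61    Lady Octopus
analytics-pageview-2016-08-30 1 r UNASSIGNED
analytics-pageview-2016-08-30 0 p STARTED     842086 255.9mb 10.1.0.61    Lady Octopus
analytics-pageview-2016-08-30 0 r UNASSIGNED
"""


def unassigned_replicas(cat_output):
    """Map each index name to the shard numbers whose replica is UNASSIGNED.

    Hypothetical helper: assumes the default column order
    (index, shard, prirep, state, ...) with no header row.
    """
    result = {}
    for line in cat_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated rows
        index, shard, prirep, state = fields[:4]
        if prirep == "r" and state == "UNASSIGNED":
            result.setdefault(index, []).append(int(shard))
    return result


if __name__ == "__main__":
    for index, shards in unassigned_replicas(CAT_SHARDS).items():
        print(f"{index}: replica unassigned for shards {shards}")
```

Any index listed here has fewer allocated copies than its settings ask for. If running with a single 'warm' node is intentional, one option (not something suggested in this thread, so treat it as an assumption to verify on a test index) is to set `index.number_of_replicas` to 0 for that index through the index settings API; the other is to add a second 'warm' node so the replica can be allocated.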


(Currycoder) #3

Ah - that makes sense. Thanks for taking the time to answer.

