erikthered (Erik Redding), February 10, 2016, 4:20pm, #1
Continuing the discussion from "_forcemerge query not working?":
Hello,
I have an index that was around 460GB. I performed a delete by query on that index to remove ~3/4 of all its documents. I understand (after reading lots of documentation) that Lucene just flags these documents as deleted and only removes them later, when it merges its segments.
The problem is that Lucene only slightly merged my segments after this delete. The cluster size barely decreased after the merge. I was expecting ~200-300GB of space to be freed.
I tried the _forcemerge API (and even the deprecated _optimize), first setting max_num_segments to 1 and then setting only_expunge_deletes. Neither seems to have any effect at all (perhaps they don't override the Elastic Found cluster settings?).
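For reference, the two force merge calls described above can be sketched as follows. This is a minimal illustration, not the exact requests that were sent: the host `http://localhost:9200` and the index name `my_index` are placeholders, and `forcemerge_request` is a hypothetical helper. The parameter names `max_num_segments` and `only_expunge_deletes` are the ones the Elasticsearch 2.x force merge API accepts.

```python
from urllib.parse import urlencode
from urllib.request import Request


def forcemerge_request(base_url, index, **params):
    """Build (but do not send) a POST request for the _forcemerge endpoint.

    base_url and index are placeholders supplied by the caller.
    """
    # Elasticsearch expects lowercase booleans in the query string.
    query = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    url = f"{base_url.rstrip('/')}/{index}/_forcemerge"
    if query:
        url += "?" + query
    return Request(url, data=b"", method="POST")


# Merge everything down to a single segment.
to_one_segment = forcemerge_request(
    "http://localhost:9200", "my_index", max_num_segments=1
)

# Only rewrite segments that carry deleted documents.
expunge_only = forcemerge_request(
    "http://localhost:9200", "my_index", only_expunge_deletes=True
)

print(to_one_segment.get_method(), to_one_segment.full_url)
print(expunge_only.get_method(), expunge_only.full_url)
```

Passing either request to `urllib.request.urlopen` would actually send it; on a hosted cluster the URL would also need the cluster's own endpoint and credentials.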
Running _cat/segments shows ~100 segments for this index, like these:
index           0 p IP_ADDRESS _55g    6676  259526  492417   10.7gb 15986248 true  true  5.3.1 false 
index           0 p IP_ADDRESS _dlk   17624  300293  393906   10.5gb 17363328 true  true  5.3.1 false 
index           0 p IP_ADDRESS _do3   17715  304912  228060    8.3gb 16702931 true  true  5.3.1 false 
index           0 p IP_ADDRESS _etl   19209  234812  212975    6.2gb 14415090 true  true  5.3.1 false 
index           0 p IP_ADDRESS _rph   35909  243908  597468   11.6gb 16228232 true  true  5.3.1 false 
index           0 p IP_ADDRESS _xyq   44018  211087  600916   10.6gb 14982992 true  true  5.3.1 false 
index           0 p IP_ADDRESS _10f1  47197  150588  563340    7.9gb 12553617 true  true  5.3.1 true 
index           0 p IP_ADDRESS _10m5  47453  245789  347914    8.2gb 14972958 true  true  5.3.1 true 
index           0 p IP_ADDRESS _5ujm 272866  182584 1105416   15.5gb 15480243 true  true  5.3.1 true 
index           0 p IP_ADDRESS _5ujq 272870  175804  820951   11.5gb 14322993 true  true  5.3.1 true 
index           0 p IP_ADDRESS _5ujw 272876    6557 1411516    6.5gb  7773895 true  true  5.3.1 true 
index           0 p IP_ADDRESS _5uye 273398    6534  791621    4.3gb  6949108 true  true  5.3.1 true 
index           0 p IP_ADDRESS _5vtv 274531   12933  811501    6.1gb 10845022 true  true  5.3.1 true...
Shouldn't these segments be triggering a merge automatically? Some of them are 99.5% deleted documents.
What's preventing Lucene from cleaning up these segments for me?
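The deleted percentage per segment can be recomputed from the _cat/segments rows. The sketch below assumes the default column order for that output (index, shard, prirep, ip, segment, generation, docs.count, docs.deleted, size, size.memory, committed, searchable, version, compound) and uses three of the rows listed above; `deleted_fraction` is a hypothetical helper name.

```python
# Three rows copied from the _cat/segments output above (whitespace collapsed).
ROWS = """\
index 0 p IP_ADDRESS _55g 6676 259526 492417 10.7gb 15986248 true true 5.3.1 false
index 0 p IP_ADDRESS _5ujw 272876 6557 1411516 6.5gb 7773895 true true 5.3.1 true
index 0 p IP_ADDRESS _5uye 273398 6534 791621 4.3gb 6949108 true true 5.3.1 true
"""


def deleted_fraction(row):
    """Return (segment name, deleted docs / all docs) for one row.

    Assumes column 5 is the segment name, column 7 is docs.count (live docs)
    and column 8 is docs.deleted, counting columns from 1.
    """
    fields = row.split()
    segment = fields[4]
    live = int(fields[6])
    deleted = int(fields[7])
    return segment, deleted / (live + deleted)


for row in ROWS.splitlines():
    segment, fraction = deleted_fraction(row)
    print(f"{segment}: {fraction:.1%} deleted")
```

For segment _5ujw this gives 1411516 / (6557 + 1411516), which is the 99.5% figure mentioned above.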
Cluster ID: 3daac0
 
 
This is a very Elasticsearch-specific question, so I've reposted it in the appropriate topic.
The issue concerns Elasticsearch 2.1.1.

mikemccand (Michael McCandless), March 1, 2016, 2:49pm, #2
Hmm, there was this bug in Lucene:
https://issues.apache.org/jira/browse/LUCENE-6166
where Lucene failed to trigger merges if all you did was delete and never add any documents. It was fixed in Lucene 5.3.0, which should be included in ES 2.1.1, so you should have seen merges running without having asked for forceMerge. Have you added any documents since doing the deletions?
Are you sure there are no merges running?  They will take quite some time to complete.  What refresh_interval do you have?
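One way to check whether merges are still running is to look at the `merges` section of the index stats (for example from `GET /my_index/_stats/merge`, where `my_index` is a placeholder). The sketch below only parses such a response. The JSON is a made-up, trimmed sample with invented numbers, not output from the cluster in question, and the field layout is assumed to match the usual indices stats response.

```python
import json

# Hypothetical, trimmed stats response; all values are invented for illustration.
SAMPLE_STATS = json.loads("""
{
  "_all": {
    "primaries": {
      "merges": {
        "current": 2,
        "current_docs": 1500000,
        "current_size_in_bytes": 8589934592,
        "total": 431
      }
    }
  }
}
""")


def merges_in_progress(stats):
    """Return (running merge count, bytes being merged) for primary shards."""
    merges = stats["_all"]["primaries"]["merges"]
    return merges["current"], merges["current_size_in_bytes"]


count, size_bytes = merges_in_progress(SAMPLE_STATS)
print(f"{count} merge(s) running, {size_bytes / 1024**3:.1f} GiB in flight")
```

A `current` value above zero would mean merges are still in progress. The refresh_interval can be read separately from the index settings (`GET /my_index/_settings`).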
Separately, forceMerge definitely should have done something.
Can you post any settings you've changed from defaults?