We are using Elasticsearch 5.6.
Recently we did some cleanup on one of our indexes and removed 300 million docs out of a total of about 2 billion. While the delete_by_query was running, our index memory usage exploded along with the number of segments. The index size went from 2 TB to 3.7 TB.
The increase in segments was not surprising, but the increase in size was. Once we killed the delete_by_query task, the segments merged down and reclaimed about half a TB of memory. The number of deleted docs returned to normal, but the index was still more than a TB bigger than before.
This morning the index returned to its normal size due to a large drop in segments.fixed_bitset_memory across all of the nodes. Over the course of about a minute, roughly a TB worth of data was flushed.
QUESTIONS:
From what I have read, segments.fixed_bitset_memory is described as:
"Memory used by fixed bit sets for nested object field types and type filters for types referred in join fields"
- Only 20k of the deleted docs had nested fields, and those were only one level deep. BUT we do have a parent/child relationship, since we are on ES 5.x. Is that considered a join, and would a fixed bit set be used to represent it? Any further explanation of fixed bit sets would be greatly appreciated!
- Is there any way to control when these fixed bit sets are flushed during large deletions?
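
For context, here is my rough mental model of how the memory for fixed bit sets scales, written as a small Python sketch. It assumes one bit per document in each segment (counting deleted docs that have not been merged away yet) for every cached filter. That is my understanding of Lucene's FixedBitSet, not something I have confirmed for our case. The function names and the segment layout below are made up for illustration, and object overhead is ignored.

```python
def fixed_bitset_bytes(max_doc):
    """Approximate heap bytes for one fixed bit set over a segment.

    Assumption: the bit set is backed by 64-bit words, one bit per doc.
    """
    if max_doc <= 0:
        return 0
    words = (max_doc + 63) // 64  # round up to whole 64-bit words
    return words * 8


def total_bitset_bytes(segment_max_docs, cached_filters):
    """Sum the estimate over all segments, once per cached filter."""
    per_filter = sum(fixed_bitset_bytes(n) for n in segment_max_docs)
    return cached_filters * per_filter


# Hypothetical layout: ~2 billion docs spread evenly over 100 segments,
# with a single cached filter (e.g. one parent type filter).
segments = [20_000_000] * 100
estimate = total_bitset_bytes(segments, cached_filters=1)
print(estimate)
```

If this model is right, the total should grow with both the number of cached filters and the sum of segment sizes, which is why I would like to understand which filters count here.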
Thanks in advance for any help!