Index throttling issue

Just chiming in to say that if you can avoid TTL, you'll greatly reduce your merge pressure.

TTL works by (essentially) running a query every 60s and finding all docs that have expired, then executing individual deletes against those documents. These deleted docs linger in your segments until Lucene's merge scheduler decides to merge them out.

Basically, TTL pokes a lot of little holes in all of your segments, so the merge scheduler is constantly cleaning up half-filled segments. Ultimately, that means you are moving a lot of data around the disk all the time.
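To make the "little holes" picture concrete, here is a toy model in plain Python. It is not how Lucene is implemented: segments are just lists, and the helper names (ttl_sweep, deleted_ratio) are made up for illustration. It only shows that when expiry times are interleaved across segments, a single sweep leaves every segment partly deleted.

```python
def ttl_sweep(segments, now):
    """Mark expired docs as deleted and return how many segments were touched."""
    touched = 0
    for seg in segments:
        expired = [d for d in seg if not d["deleted"] and d["expire_time"] <= now]
        for doc in expired:
            # Only a tombstone: the space is reclaimed later, by a merge.
            doc["deleted"] = True
        if expired:
            touched += 1
    return touched


def deleted_ratio(seg):
    """Fraction of a segment's docs that are marked deleted."""
    return sum(d["deleted"] for d in seg) / len(seg)


# 10 segments of 100 docs each; expiry times are interleaved across segments,
# as happens when docs with mixed lifetimes are indexed side by side.
segments = [
    [{"expire_time": j * 10 + i, "deleted": False} for j in range(100)]
    for i in range(10)
]

touched = ttl_sweep(segments, now=300)
ratios = [deleted_ratio(seg) for seg in segments]
print(touched, "of", len(segments), "segments now contain deleted docs")
```

Every segment ends up with some deleted docs but none is empty, so none can simply be dropped; each one becomes a merge candidate.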

If, instead, you can structure your indices using a time-based approach (e.g. index-per-day), you can simply delete the entire index. This is equivalent to deleting a directory off the disk, and doesn't require any expensive merging.
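A rough sketch of the bookkeeping for index-per-day, in Python. The "logs-" prefix, the YYYY.MM.DD suffix and the function names are assumptions for the example, not anything your cluster requires. The code only works out which index names have aged out; actually dropping them is a single delete-index call per name against the cluster, which is left out here.

```python
from datetime import date, datetime, timedelta


def index_name(day, prefix="logs-"):
    """Name of the daily index for a given date (naming scheme is an assumption)."""
    return f"{prefix}{day:%Y.%m.%d}"


def expired_indices(existing, today, retention_days, prefix="logs-"):
    """Return the daily indices that are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix):
            continue  # not one of our daily indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # prefix matches but the suffix is not a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


existing = ["logs-2014.01.01", "logs-2014.01.02", "logs-2014.01.03", "other"]
to_delete = expired_indices(existing, today=date(2014, 1, 3), retention_days=1)
print(to_delete)
```

With one day of retention, yesterday's index is kept because it may still hold docs that are less than 24 hours old; only indices from before that are dropped.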

Often a time-based index alone doesn't give fine enough granularity for your application. In that case, include an expire_time field in each document and a corresponding range filter in your queries, so docs stop being served once their 24hr period is up, even though the index holding them hasn't been deleted yet.
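A minimal sketch of the query side, again in Python. It only builds the request body as a dict. The bool/filter shape is an assumption about your Elasticsearch version (older releases expressed the same thing with a filtered query), and with_expiry_filter is a made-up helper name. expire_time is assumed to be mapped as a date field that accepts ISO 8601 strings.

```python
import json
from datetime import datetime, timezone


def with_expiry_filter(query, now=None, field="expire_time"):
    """Wrap a query so that only docs whose expire_time is still in the future match."""
    now = now or datetime.now(timezone.utc)
    return {
        "query": {
            "bool": {
                "must": query,
                # Docs past their expiry are filtered out even though they
                # still physically exist until the daily index is deleted.
                "filter": {"range": {field: {"gt": now.isoformat()}}},
            }
        }
    }


body = with_expiry_filter(
    {"match_all": {}},
    now=datetime(2014, 1, 3, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

Each document would get its expire_time set at index time (for example, index time plus 24 hours), so nothing ever has to be deleted individually.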