Logstash doesn't do anything "magical" with its indices; adding and using time-based indices is easy to do yourself.
What most people do is set up their indices at the same granularity as their delete batch, because Elasticsearch is much more efficient at deleting whole indices than at something like delete-by-query. So if you plan to delete one hour at a time, you'd create indices that each encompass one hour of data. That is, when you index a document into Elasticsearch, you'd do something like:
POST /myindexprefix-2018-02-16.0700/doc
{
  "field1": "value1",
  ...
}
Then, once myindexprefix-2018-02-16.0700 reaches your maximum age, you delete the whole index, which should be a very fast operation.
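For example, deleting that hour's data once it ages out is a single request:

DELETE /myindexprefix-2018-02-16.0700

This drops the whole index at once, rather than marking individual documents as deleted the way a delete-by-query would.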
Given the previous example, you can also query all of the indices at once by doing something like:
GET /myindexprefix-*/_search
{
  ...
}
Elasticsearch will then resolve the * into all of the indices matching the given pattern.
You can also get fancier if you want. Elasticsearch supports date math in index names in the URL, so you can take now, round it down to the nearest hour, and have that resolved automatically in the index name; see the Elasticsearch date math documentation for examples.
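A minimal sketch of what that looks like, assuming the same hourly naming scheme as above (date math index names must be URI-encoded, so the readable form <myindexprefix-{now/h{yyyy-MM-dd.HHmm}}> becomes the encoded path below; times resolve in UTC by default):

POST /%3Cmyindexprefix-%7Bnow%2Fh%7Byyyy-MM-dd.HHmm%7D%7D%3E/doc
{
  "field1": "value1"
}

At index time, Elasticsearch evaluates now, rounds it down to the hour, and writes the document into the matching hourly index (e.g. myindexprefix-2018-02-16.0700), so your clients never have to compute the index name themselves.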
I will say that with hourly indices, you're going to want to be careful about total shard count. With hourly indices, you'd be creating 24 * (number of shards per index) * (number of unique prefixes) * (number of days of retention) shards. Even with just a few different prefixes and a few weeks or months of retention, that number can get very large, and very large shard counts can cause problems. This is one of the reasons why many people choose daily, weekly, or monthly deletion granularity (and thus daily/weekly/monthly index names) instead of hourly. As long as you have a short retention period, only a few prefixes, and a low shard count per index, you should be fine, but you may want to reconsider if any of that changes.
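To make that concrete with hypothetical numbers: at 1 shard per index, 3 prefixes, and 30 days of retention, that's 24 * 1 * 3 * 30 = 2,160 shards, which can already strain a smaller cluster.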