Hourly time-based indices: reducing shard segments

We are using fluentd, Elasticsearch, and Kibana for log aggregation. The amount we log has steadily grown, so we've switched to hourly time-based indices to try to keep our index size under 50 GB, a limit we often exceeded with daily indices. Search performance obviously degrades as the number of indices searched grows: if we now search for something over the past 7 days in Kibana, the query often times out at our 60-second search limit.

Beyond just increasing the timeout, I was looking into how we could optimize the indices that are no longer being written to, and I came across force merging shard segments. From a random sampling of these time-based indices, we at times have around 100 segments for a single hourly index. What would the recommendation be for optimizing these segments? Since they are time-based, these indices should be considered static and read-only once the hour has passed, barring some networking issue. Is the target truly one segment? The max index size I've seen with hourly indices is around 4 GB, and at low-volume times it can be just a couple hundred MB. How does one properly size the segment count for these read-only indices?
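For anyone wanting to reproduce the sampling, segment counts per shard can be inspected with the cat segments API; a minimal sketch, where the index name is just illustrative of an hourly naming scheme:

```
GET _cat/segments/logs-2018.10.01-13?v
```

The output has one row per segment, including its size and document count.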

I would recommend that you have a look at the rollover API. This allows you to create and switch to a new index when necessary, depending on document count and/or age, rather than just time as in your current setup. Using this you can generate more indices under heavy load and fewer when data volumes drop, which will result in fewer shards of much more similar size.
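A minimal sketch of what that could look like, assuming a hypothetical `logs-write` alias and `logs-000001` as the bootstrap index (the size-based `max_size` condition requires Elasticsearch 6.1 or later):

```
# Bootstrap the first index with the alias that rollover acts on
PUT /logs-000001
{
  "aliases": { "logs-write": {} }
}

# Call this periodically (e.g. from cron); a new index is created
# only if at least one condition has been met
POST /logs-write/_rollover
{
  "conditions": {
    "max_age": "1d",
    "max_size": "40gb",
    "max_docs": 50000000
  }
}
```

Indexing should then always go through the alias, so writes move to the new index automatically after each rollover.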

Force merge can, as you suggest, be very useful for older, read-only indices, but it is quite an expensive operation. I am, however, not sure what the optimal number of segments to target is.
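For reference, the call itself is simple; a sketch against a hypothetical hourly index, to be run only once the index is no longer being written to:

```
# Merge all segments of a read-only index down to one
POST /logs-2018.10.01-13/_forcemerge?max_num_segments=1
```

Setting the index read-only first (via the `index.blocks.write` setting) guards against a stray late write creating new segments after the merge.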

Aren't searches within a shard serialized, though, while each index can be searched in parallel?

They are, but searching lots of small indices can be slower than searching fewer, larger ones, as the latter means fewer tasks need to be queued up and executed.

What do you consider small indices? Our hourly indices are normally 4 GB.

A little more about the cluster.

We use it for logging. It has nonprod and prod index prefixes, with each of them getting new hourly indices. These are usually around 4 GB apiece. From casual inspection the average is probably 1.2 million documents or so, but we do have higher peaks.

Any other settings or resources you can recommend for configuration? Right now we have a six-node cluster without any dedicated masters. Our search performance is pretty poor, though; a search can take a couple of minutes depending on the query.

You have to test, but a single shard can be 20-50 GB.


As David pointed out, a good target shard size (not index size) for many logging use cases is often measured in tens of GB. If your largest hourly index is 4 GB (so at most roughly 96 GB per day), a daily index with 6 primary shards (given that you have 6 nodes) might therefore be sufficient.
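As a sketch, if you moved back to daily indices, that shard count could be set via an index template; the template name and pattern here are assumptions, and on 5.x the `index_patterns` key was called `template`:

```
PUT _template/logs-daily
{
  "index_patterns": ["logs-*"],
  "settings": {
    "index.number_of_shards": 6,
    "index.number_of_replicas": 1
  }
}
```

With 6 primaries on 6 nodes, a 96 GB peak day works out to roughly 16 GB per primary shard, in the same ballpark as the sizes David mentioned.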

How about the index refresh interval, then? I get that at the default of 1 second it creates a lot of segments. Is every two minutes too long? Should it be longer or shorter?

That depends on how long you are willing to wait before the data becomes searchable.
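If you want to experiment, the interval can be changed on a live index; a sketch, with an illustrative index name:

```
PUT /logs-2018.10.01-13/_settings
{
  "index": { "refresh_interval": "120s" }
}
```

At 120s, newly indexed log lines can take up to two minutes to appear in Kibana, which is exactly the trade-off above.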
