What does _forcemerge?max_num_segments=1 do if the segment would exceed the 500GB limit?

Does a _forcemerge?max_num_segments=1 request fail when the segment reaches 500GB, or would the request create a second segment?

We accept longer recovery times in exchange for 100GB shards and better search performance. This is a time-series, one-index-per-day use case. We want to reduce the number of segments once an index no longer needs to accept new documents. Our cluster has 6 data nodes, and each daily index has 6 primary shards and 1 replica. It indexes about 900M documents (300GB) per day. We are getting gateway errors from Kibana on 7-day searches.
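For reference, the request in question could be built along these lines. This is only a sketch: the `logs-` prefix and the `YYYY.MM.DD` date suffix are assumed index naming, not something stated above, and the snippet only prints the request line rather than sending it.

```python
from datetime import date, timedelta


def forcemerge_path(day: date, prefix: str = "logs-", max_num_segments: int = 1) -> str:
    """Build the _forcemerge path for one daily index.

    The prefix and date format are hypothetical; adjust to the real index names.
    """
    index = f"{prefix}{day:%Y.%m.%d}"
    return f"/{index}/_forcemerge?max_num_segments={max_num_segments}"


# Yesterday's index no longer receives new documents, so it is the one to merge.
yesterday = date.today() - timedelta(days=1)
print("POST " + forcemerge_path(yesterday))
```

The printed line can be pasted into Kibana Dev Tools or sent with any HTTP client.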

The timeout is most likely because your shards are too big: there is too much data for a single node to comb through in a timely fashion. You are correct that a smaller segment count will help, but it won't eliminate the problem.

There are a few ways to solve this. One is to use the _rollover API to keep each index from growing beyond n documents. You would also need to update aliases separately, at or around the time of each rollover, to maintain a "today" alias that points to all of the indices holding some of today's data.
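A rough sketch of the two requests involved, built as plain Python dicts so nothing is actually sent. The alias names (`logs-write`, `today`), the index name `logs-000002`, and the document threshold are all made up for illustration; only the `_rollover` endpoint with a `max_docs` condition and the `_aliases` endpoint with an `add` action are real API pieces.

```python
import json


def rollover_request(write_alias: str, max_docs: int):
    """Path and body for POST /<alias>/_rollover with a max_docs condition."""
    path = f"/{write_alias}/_rollover"
    body = {"conditions": {"max_docs": max_docs}}
    return path, body


def today_alias_request(new_index: str, today_alias: str = "today"):
    """Path and body for POST /_aliases that adds new_index to the "today" alias."""
    path = "/_aliases"
    body = {"actions": [{"add": {"index": new_index, "alias": today_alias}}]}
    return path, body


# Hypothetical values: roll over once the index holds 150M documents.
for path, body in (
    rollover_request("logs-write", 150_000_000),
    today_alias_request("logs-000002"),
):
    print("POST", path)
    print(json.dumps(body, indent=2))
```

Indices that no longer hold any of today's data would need a matching `remove` action in the same `_aliases` call, which is left out here to keep the sketch short.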

Another solution is to add more nodes and have a higher number of shards per index, to spread the data around more.
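To put rough numbers on that, here is a back-of-the-envelope calculation. It assumes the 300GB/day figure is primary data only and that a search is spread evenly over the data nodes, neither of which is confirmed above; the 12-node and 12-shard figures are just an example.

```python
def gb_per_node(daily_gb: float, days: int, nodes: int) -> float:
    """Data each node has to search for a query spanning `days` daily indices."""
    return daily_gb * days / nodes


def gb_per_primary_shard(daily_gb: float, primaries: int) -> float:
    """Average primary shard size for one daily index."""
    return daily_gb / primaries


# Current layout: 6 data nodes, 6 primaries per daily index, 7-day search.
print(gb_per_node(300, 7, 6))        # 350.0 GB per node
print(gb_per_primary_shard(300, 6))  # 50.0 GB per primary shard

# Example of doubling both nodes and primaries.
print(gb_per_node(300, 7, 12))        # 175.0 GB per node
print(gb_per_primary_shard(300, 12))  # 25.0 GB per primary shard
```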

Yet another solution is to not use a proxy or load balancer (I'm assuming your gateway error is a 504, which usually means a load balancer or proxy timed out), or to drastically increase its timeout. If you don't mind slower queries, this solution is also viable.
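One way to check whether the proxy is the thing timing out is to send the same kind of 7-day query straight to Elasticsearch with a generous client-side timeout. A minimal sketch, assuming a node reachable at `http://localhost:9200`, indices matching `logs-*`, and an `@timestamp` field (all guesses, not taken from the question):

```python
import json
import urllib.request


def build_search_request(base_url: str, index_pattern: str, days: int) -> urllib.request.Request:
    """Build a POST _search request covering the last `days` days."""
    body = {
        "size": 0,  # only the hit count is needed, not the documents
        "query": {"range": {"@timestamp": {"gte": f"now-{days}d"}}},
    }
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request: urllib.request.Request, timeout_seconds: float = 300):
    """Send the request, waiting up to timeout_seconds for the response."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as resp:
        return json.load(resp)


req = build_search_request("http://localhost:9200", "logs-*", 7)
print(req.get_method(), req.full_url)
# result = run_search(req)  # uncomment when pointing at a real cluster
```

If this returns after a minute or two while Kibana gives a gateway error, the timeout sits in the proxy or load balancer rather than in Elasticsearch itself.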

There are likely other solutions as well. Those are just the three off the top of my head.
