Elasticsearch Segment Size

I am currently building a large Elasticsearch cluster that needs to eventually handle 1,000,000 index requests per second. I am currently scaling toward that point (at about 250k/s), but I am being held back by a large number of segment merges. When I fire up the data stream, the cluster runs well for about 5 minutes, then I start to see throttling due to the volume of merges.

I want to try to force Elasticsearch to create large initial segments so that it doesn't waste time merging small ones. We have a lot of RAM to work with (32GB for ES), so we can afford to build large segments in memory.

I thought I could achieve this through the following settings:

indices.memory.index_buffer_size: 30%
index.translog.flush_threshold_size: 5g
index.refresh_interval: 60s

The idea is to allocate a large amount of memory to the Index Writer while setting the refresh interval and flush threshold to be large enough that segments don't get committed very often.
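For reference, indices.memory.index_buffer_size is a node-level setting that lives in elasticsearch.yml, while the other two can also be applied per index at runtime, roughly like this (the index name "events" here is hypothetical):

# apply the index-level settings live over the REST API
curl -XPUT 'localhost:9200/events/_settings' -d '{
  "index" : {
    "translog.flush_threshold_size" : "5g",
    "refresh_interval" : "60s"
  }
}'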

Unfortunately I am still seeing many small segments getting created, resulting in throttling and overall poor performance.

Currently I have 4 active indices, for a total of only 8 shards per node.

Here are a few snapshots of what I'm seeing in Marvel.

As you can see, my Index Writer memory is not even close to being fully utilized, and my segment count is fairly high.

Any input on what I might be doing wrong or how I can achieve my desired behavior would be greatly appreciated.

Thanks,
Harlin

If your heap is 32G you should reduce that to 30.5G, so the JVM can keep using compressed object pointers; above roughly that size you lose compressed oops and waste a chunk of the heap.

Have you looked at https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html?
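In particular, that page covers the merge throttle and the merge policy's segment floor. Roughly like this (values purely illustrative, not a recommendation):

# dynamic cluster-wide store throttle
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : { "indices.store.throttle.max_bytes_per_sec" : "200mb" }
}'

# per-index merge knobs (elasticsearch.yml style)
index.merge.policy.floor_segment: 100mb
index.merge.scheduler.max_thread_count: 1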

We have the heap set at 30g currently, and yes I have read through the documentation.

Generally for this sort of thing we recommend a hot/cold architecture: hot nodes that handle the indexing and merging on SSDs, and cold nodes where fully merged and optimised indices with no more active writes can live.
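A minimal sketch of that layout, using a custom box_type node attribute (names and index illustrative):

# elasticsearch.yml on hot nodes
node.box_type: hot
# elasticsearch.yml on cold nodes
node.box_type: cold

# pin an actively written index to the hot tier
curl -XPUT 'localhost:9200/events-2015.06.01/_settings' -d '{
  "index.routing.allocation.require.box_type" : "hot"
}'

# once writes stop: fully merge it, then relocate it to the cold tier
curl -XPOST 'localhost:9200/events-2015.06.01/_optimize?max_num_segments=1'
curl -XPUT 'localhost:9200/events-2015.06.01/_settings' -d '{
  "index.routing.allocation.require.box_type" : "cold"
}'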


indices.memory.index_buffer_size: 30%
index.translog.flush_threshold_size: 5g
index.refresh_interval: 60s

We are using a previous ES version and our settings are almost the same; the only difference is that we also set flush_threshold_ops to 50k. We have more than 26 shards per node but fewer than 1,000 segments, with less heap per node (14GB), and indexing is still at a reasonable speed, doing around 100+ indexing requests per node per second.

Just out of curiosity, which JVM are you using, and how many nodes do you have in your cluster?


@Harlin_ES, are you also searching these indices at the same time the indexing is happening? If not, why not set the refresh interval to -1?
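That toggle would look something like this (index name hypothetical):

# disable refresh entirely during the bulk load
curl -XPUT 'localhost:9200/events/_settings' -d '{
  "index" : { "refresh_interval" : "-1" }
}'

# restore it once the load is done
curl -XPUT 'localhost:9200/events/_settings' -d '{
  "index" : { "refresh_interval" : "60s" }
}'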

@Jason_Wee

We didn't set flush_threshold_ops since it defaults to unlimited. We are using Java 8 on a total of 7 nodes (including one dedicated master).

What does your Index Writer Memory look like? Is it utilizing the memory allocated to it?

@mosiddi

Unfortunately, yes, we have to be able to search and index simultaneously, so we can't set the refresh interval to -1.

I see. How large are your new segments?

@mosiddi

Judging by the utilization of our Index Writer memory, it looks like we are initially writing 16-64MB segments. We want to get this number up to at least 512MB. I feel this should be possible since we are only indexing into 1-2 indices at a time and we currently have ~10GB per node allocated to the index writer.

Each refresh cycle (60s in your case) would generate a new segment. I'm wondering if the indexing load (documents per second × average document size × 60s) only comes to 16-64MB?

@mosiddi

So we are currently indexing at 250k docs/s at about 250 bytes/doc.
This means that every 60s we are indexing roughly 3.75GB of data. We should be seeing far larger segments, but something is causing Elasticsearch to create segments a lot more often than we want.
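Spelled out:

250,000 docs/s × 250 bytes/doc = 62.5 MB/s
62.5 MB/s × 60 s ≈ 3.75 GB per refresh cycle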

The index buffer is shared across all shards, and segments are created at the shard level. So for 3.75GB, the per-shard load would be 3.75GB / (4 indices × number of shards per index). Some shards will be less loaded and will get smaller segments, while others will get bigger ones.

We are indexing by time period, so only one index is active at a time. We have 6 primaries per index with one replica currently, so a total of only 12 shards.
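Even if that 3.75GB spread perfectly evenly, each of the 6 active primaries should see roughly 3.75GB / 6 ≈ 640MB per refresh cycle, an order of magnitude more than the 16-64MB segments we are actually getting.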

I dug a little further into the segments using the indices segments API and saw that Elasticsearch is creating a lot of segments in memory before any refreshes or flushes. For example, within 30 seconds of running my indexer I had >200 segments in my index, none of which had been committed. Some of them were as small as 1MB.

I am guessing this has something to do with my indexing strategy; I am using a bulk size of 2,500. Will Elasticsearch create a new segment for each bulk request?
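The check I ran was along these lines (index name hypothetical):

curl -XGET 'localhost:9200/events-2015.06.01/_segments?pretty'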

Just to give an idea of some of the segments we are seeing:

"5" : [ {
          "routing" : {
            "state" : "STARTED",
            "primary" : true,
            "node" : "X77hBhkKQKe8qBPK3d6_GQ"
          },
          "num_committed_segments" : 0,
          "num_search_segments" : 10,
          "segments" : {
            "_k" : {
              "generation" : 20,
              "num_docs" : 56167,
              "deleted_docs" : 0,
              "size_in_bytes" : 12300173,
              "memory_in_bytes" : 99154,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : false
            },
            "_1g" : {
              "generation" : 52,
              "num_docs" : 99560,
              "deleted_docs" : 0,
              "size_in_bytes" : 21256102,
              "memory_in_bytes" : 136938,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : false
            },
            "_2b" : {
              "generation" : 83,
              "num_docs" : 89181,
              "deleted_docs" : 0,
              "size_in_bytes" : 19086181,
              "memory_in_bytes" : 135218,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : false
            },
            "_36" : {
              "generation" : 114,
              "num_docs" : 98662,
              "deleted_docs" : 0,
              "size_in_bytes" : 21086454,
              "memory_in_bytes" : 132618,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : false
            },
            "_3g" : {
              "generation" : 124,
              "num_docs" : 30367,
              "deleted_docs" : 0,
              "size_in_bytes" : 6742786,
              "memory_in_bytes" : 62402,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            },
            "_3r" : {
              "generation" : 135,
              "num_docs" : 37202,
              "deleted_docs" : 0,
              "size_in_bytes" : 8305559,
              "memory_in_bytes" : 76914,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            },
            "_3s" : {
              "generation" : 136,
              "num_docs" : 298,
              "deleted_docs" : 0,
              "size_in_bytes" : 88212,
              "memory_in_bytes" : 6306,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            },
            "_3t" : {
              "generation" : 137,
              "num_docs" : 5840,
              "deleted_docs" : 0,
              "size_in_bytes" : 1393138,
              "memory_in_bytes" : 17506,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            },
            "_3u" : {
              "generation" : 138,
              "num_docs" : 3657,
              "deleted_docs" : 0,
              "size_in_bytes" : 889948,
              "memory_in_bytes" : 13274,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            },
            "_3v" : {
              "generation" : 139,
              "num_docs" : 85,
              "deleted_docs" : 0,
              "size_in_bytes" : 25803,
              "memory_in_bytes" : 5714,
              "committed" : false,
              "search" : true,
              "version" : "4.10.4",
              "compound" : true
            }
          }
        },

Notice that (1) none of these segments have been committed, and (2) some of the segments are very small (as little as <1MB).

This is after indexing for roughly 20s into a new index.


Hi @Harlin_ES, so what was the end of this story? Did you manage to find the root cause of the small-segment problem and finally get bigger segments?
Thanks for sharing.

