Is reindexing much heavier than initial indexing?

I have a single server running ES 9.2.5.

It has 40 cores / 256 GB RAM / 12 TB of SSD.
There are 2 aliases; ILM rotates indices at 200 GB (10 shards per index).

Everything works fine in terms of indexing/search speed, but I cannot reindex the larger index (a serious mapping change is required).

I stop all activity on ES while reindexing, but the allowed downtime is 2-3 days at most, so parallel reindexing is crucial.

The “smaller” alias (doc size 1-200 KB, total size 1 TB) with its underlying indices was successfully reindexed in 6 threads.

ES slicing was not really useful; manually starting 6 scripts saturated the hardware much better.

But the “larger” alias (doc size 1-400 MB (yes, some docs are large), total size 10 TB) fails at a concurrency of 2, 3 or 4.

I split the reindexing of every index into three jobs:

1 sub-500 KB docs - batch size 200
2 500-5000 KB docs - batch size 8
3 5000 KB and larger docs - batch size 1

Technically, no single reindex job should consume more than 1-3 GB of RAM.
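
As a rough check of that estimate, here is a small sketch that bounds the raw document bytes held by one batch as batch size times the largest document in the bucket. The batch sizes come from the list above and the 400 MB maximum from the doc size range of the larger alias. The helper names `batch_bytes` and `to_mib` are made up for illustration. It counts only source bytes for a single batch and ignores whatever overhead Elasticsearch adds per request, so treat it as a lower bound on real usage.

```shell
#!/bin/bash
# Upper bound on raw source bytes in one batch:
# batch size multiplied by the largest document allowed in that bucket.
# Assumption: parsing, bulk request copies and analysis overhead are NOT counted.
batch_bytes() {
  local batch_size=$1 max_doc_bytes=$2
  echo $(( batch_size * max_doc_bytes ))
}

# Convert bytes to whole MiB (integer division, rounds down).
to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

KB_500=512000                      # 500 KB, same constant as in the script
MB_5=5242880                       # 5 MB, same constant as in the script
MB_400=$(( 400 * 1024 * 1024 ))    # largest doc size mentioned for the larger alias

echo "job 1: $(to_mib "$(batch_bytes 200 "$KB_500")") MiB"   # job 1: 97 MiB
echo "job 2: $(to_mib "$(batch_bytes 8 "$MB_5")") MiB"       # job 2: 40 MiB
echo "job 3: $(to_mib "$(batch_bytes 1 "$MB_400")") MiB"     # job 3: 400 MiB
```

By this count a single batch carries at most about 400 MiB of raw source, which is where the expectation of low single-digit GB per job comes from.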

But jobs fail rather fast with circuit breaker errors.

When I run 3 reindex jobs (one index at a time, all 3 categories), monitoring breaker.parent shows a very high estimated_size.
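
For reference, a minimal sketch of how the parent breaker can be polled while the jobs run, using the node stats API with `filter_path` to keep only the parent breaker. The host and credentials are copied from my script; the function names and the 5 second default interval are arbitrary choices for illustration.

```shell
#!/bin/bash
# Host is a placeholder taken from the reindex script; override via ES_URL.
ES_URL="${ES_URL:-localhost:9200}"

# Build the node stats URL, restricted to the parent circuit breaker.
breaker_url() {
  echo "${ES_URL}/_nodes/stats/breaker?filter_path=nodes.*.breakers.parent&pretty"
}

# Print a timestamp and the parent breaker stats every INTERVAL seconds
# (default 5). Runs until interrupted with Ctrl-C.
watch_parent_breaker() {
  local interval="${1:-5}"
  while true; do
    date '+%H:%M:%S'
    curl --noproxy '*' -s -u adm:HAHA "$(breaker_url)"
    sleep "$interval"
  done
}

# Usage: watch_parent_breaker 5
```

The response includes limit_size, estimated_size and the tripped counter for the parent breaker on each node.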

I tried to set ES heap to 31GB, 165GB and even 240GB.

limit_size gets much more room, but I still couldn’t reliably run even 4x3 reindex jobs.

Any hints?

This option didn’t visibly help:
indexing_pressure.memory.limit: 20%

My next option is moving the data to a cluster, but I really do not see any reliable evidence that it will solve the problem.

My script:

#!/bin/bash

if [ -z "$1" ]; then
  echo "Usage: $0 <index_suffix_number>"
  exit 1
fi

# 2-digit zero padding; 10# forces base 10 so inputs like "08" are not parsed as octal
SUFFIX=$(printf "%02d" "$((10#$1))")

SRC_INDEX="newattach-0000${SUFFIX}"
DST_INDEX="attach-0000${SUFFIX}"

KB_500=512000        # 500 KB
MB_5=5242880         # 5 MB

############################################
# Job 1: size < 500 KB
############################################
curl --noproxy '*' -u adm:HAHA -X POST \
"localhost:9200/_reindex?refresh&scroll=8h&wait_for_completion=false&pretty" \
-H "Content-Type: application/json" -d @- <<EOF
{
  "source": {
    "index": "${SRC_INDEX}",
    "size": 400,
    "query": {
      "range": {
        "size": {
          "lt": ${KB_500}
        }
      }
    }
  },
  "dest": {
    "index": "${DST_INDEX}"
  }
}
EOF

############################################
# Job 2: 500 KB <= size < 5 MB
############################################
curl --noproxy '*' -u adm:HAHA -X POST \
"localhost:9200/_reindex?refresh&scroll=8h&wait_for_completion=false&pretty" \
-H "Content-Type: application/json" -d @- <<EOF
{
  "source": {
    "index": "${SRC_INDEX}",
    "size": 8,
    "query": {
      "range": {
        "size": {
          "gte": ${KB_500},
          "lt": ${MB_5}
        }
      }
    }
  },
  "dest": {
    "index": "${DST_INDEX}"
  }
}
EOF

############################################
# Job 3: size >= 5 MB
############################################
curl --noproxy '*' -u adm:HAHA -X POST \
"localhost:9200/_reindex?refresh&scroll=8h&wait_for_completion=false&pretty" \
-H "Content-Type: application/json" -d @- <<EOF
{
  "source": {
    "index": "${SRC_INDEX}",
    "size": 1,
    "query": {
      "range": {
        "size": {
          "gte": ${MB_5}
        }
      }
    }
  },
  "dest": {
    "index": "${DST_INDEX}"
  }
}
EOF
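
Since every job above is started with wait_for_completion=false, each curl call returns only a task ID. A sketch of how the running jobs can be followed afterwards through the task management API; host and credentials are the same placeholders as in the script, and the function names are invented for illustration.

```shell
#!/bin/bash
# Host is a placeholder taken from the reindex script; override via ES_URL.
ES_URL="${ES_URL:-localhost:9200}"

# URL listing all running reindex tasks with detailed status.
reindex_tasks_url() {
  echo "${ES_URL}/_tasks?actions=*reindex&detailed&pretty"
}

# URL for a single task; the ID has the form <node_id>:<number>,
# as returned in the "task" field of the _reindex response.
task_url() {
  echo "${ES_URL}/_tasks/$1?pretty"
}

# List every reindex task currently running.
list_reindex_tasks() {
  curl --noproxy '*' -s -u adm:HAHA "$(reindex_tasks_url)"
}

# Show status (or the final result, including failures) of one task.
show_task() {
  curl --noproxy '*' -s -u adm:HAHA "$(task_url "$1")"
}

# Usage: list_reindex_tasks
#        show_task "<node_id>:<number>"
```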