Initializing_shards got stuck

For sure, I am going to work on that. But is there any workaround for these relocating shards?

That is usually not a problem so I would recommend waiting until they complete.
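If you want to keep an eye on the relocation while you wait, a couple of read-only calls can help. This is only a sketch and assumes the node is reachable on localhost:9200 without authentication:

```
# Show cluster status and the number of relocating/initializing/unassigned shards
watch -n 10 "curl -s 'localhost:9200/_cat/health?v'"

# Or block until no shards are relocating (this parameter name applies to 2.x;
# later versions renamed it to wait_for_no_relocating_shards)
curl -s 'localhost:9200/_cluster/health?wait_for_relocating_shards=0&timeout=30m&pretty'
```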


Sure :slight_smile:

Now the status has turned green and relocating shards are down to 0, which looks fine. But now the problem is that one of the nodes keeps dropping out of the cluster, and the error was: "[2020-10-20 06:12:10,665][WARN ][netty.channel.DefaultChannelPipeline] An exception was thrown by a user handler while handling an exception event ([id: 0xb86af200, /10.46.XX.XX:47774 => /10.46.XX.XX:9300] EXCEPTION: java.lang.OutOfMemoryError: Java heap space)"

Physical RAM size is 32GB. Here is my /etc/sysconfig/elasticsearch:

# Elasticsearch logs directory
#LOG_DIR=/var/log/elasticsearch

# Elasticsearch PID directory
#PID_DIR=/var/run/elasticsearch

# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=16g

# Heap new generation
#ES_HEAP_NEWSIZE=

# Maximum direct memory
#ES_DIRECT_SIZE=

# Additional Java OPTS
#ES_JAVA_OPTS=

# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#ES_RESTART_ON_UPGRADE=true

# Path to the GC log file
#ES_GC_LOG_FILE=/var/log/elasticsearch/gc.log

################################
# Elasticsearch service
################################

# SysV init.d
#
# When executing the init script, this user will be used to run the elasticsearch service.
# The default value is 'elasticsearch' and is declared in the init.d file.
# Note that this setting is only used by the init script. If changed, make sure that
# the configured user can read and write into the data, work, plugins and log directories.
# For systemd service, the user is usually configured in file /usr/lib/systemd/system/elasticsearch.service
#ES_USER=elasticsearch
#ES_GROUP=elasticsearch

# The number of seconds to wait before checking if Elasticsearch started successfully as a daemon process
ES_STARTUP_SLEEP_TIME=5

################################
# System properties
################################

# Specifies the maximum file descriptor number that can be opened by this process
# When using Systemd, this setting is ignored and the LimitNOFILE defined in
# /usr/lib/systemd/system/elasticsearch.service takes precedence
#MAX_OPEN_FILES=65535

# The maximum number of bytes of memory that may be locked into RAM
# Set to "unlimited" if you use the 'bootstrap.mlockall: true' option
# in elasticsearch.yml (ES_HEAP_SIZE must also be set).
# When using Systemd, the LimitMEMLOCK property must be set
# in /usr/lib/systemd/system/elasticsearch.service
MAX_LOCKED_MEMORY=unlimited

# Maximum number of VMA (Virtual Memory Areas) a process can own
# When using Systemd, this setting is ignored and the 'vm.max_map_count'
# property is set at boot time in /usr/lib/sysctl.d/elasticsearch.conf
#MAX_MAP_COUNT=262144

It seems like you are suffering from heap pressure. What is the full output of the cluster stats API? It is very likely that the huge index and shard count have resulted in quite a large cluster state that needs to be held on all nodes. You might be able to check this through the cluster state API, but I do not remember exactly what it contains in such an ancient version. Reducing the number of indices and shards should help reduce the amount of heap used.
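For reference, a minimal way to pull those numbers, assuming the cluster is reachable on localhost:9200:

```
# Cluster-wide statistics, including JVM heap usage, index count and shard count
curl -s 'localhost:9200/_cluster/stats?human&pretty'

# Per-index overview of primaries, replicas and store size
curl -s 'localhost:9200/_cat/indices?v'
```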

Another thing to look for in older versions is whether you have optimized your mappings. By default all text fields are indexed both as an analyzed and a not analyzed version (the raw subfield). The analyzed version can in older versions take up a lot of heap space, so make sure your fields are only mapped as analyzed if necessary. Make sure you use doc_values as much as possible. Changing this requires data to be reindexed, so I would recommend addressing this at the same time you work to reduce index and shard count.
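As a rough sketch of what that looks like in 2.x mapping syntax (the template and field names below are made up for illustration), a string field that does not need free-text search can be mapped only as not_analyzed with doc_values:

```
curl -XPUT 'localhost:9200/_template/example_not_analyzed' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "status_code": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        }
      }
    }
  }
}'
```

Note that in 2.x doc_values are already enabled by default for not_analyzed fields, so the main saving comes from dropping the analyzed variant where it is not needed, and the change only applies to newly created or reindexed indices.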

The easiest way to resolve heap space issues without affecting the data is however to add resources, e.g. additional nodes. This will mean rebalancing data, which due to your large shard count and small shards has the potential to be quite slow.

You can naturally also choose to delete older data to free up space.

I had a look through your shard listings and it seems the largest shard in your cluster is around 250MB in size, which is very small. Keeping shards between 10GB and 50GB in size is often recommended when dealing with time series data. I would therefore recommend that you change any index templates as soon as possible to use a single primary shard, and switch your indexing to monthly rather than daily indices. As data ages out of the cluster, this will make sure the situation does not get worse and that the shard count is reduced over time. If this were implemented across the board for all data, e.g. through reindexing and consolidation of indices, it would reduce the shard count by a factor of around 180, which would be much more reasonable and in line with best practices.
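A sketch of the template change (the template name and index pattern are assumptions; adjust them to your naming scheme):

```
curl -XPUT 'localhost:9200/_template/single_shard_monthly' -d '{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'
```

Combined with switching whatever writes the indices from a daily to a monthly pattern, e.g. logstash-%{+YYYY.MM} instead of logstash-%{+YYYY.MM.dd} in the Logstash elasticsearch output, newly created indices will then have a single primary shard.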

Actually I have allocated 50% of the RAM to the heap (/etc/sysconfig/elasticsearch --> ES_HEAP_SIZE=16g),
but I found this; now how do I increase the heap?
ps aux | grep --color=auto -i Xms
497 8656 13.0 2.6 3127592 863520 ? SLl Oct18 313:33 /usr/bin/java -Xms512m -Xmx512m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/elasticsearch-2.3.4.jar:/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -p /var/run/elasticsearch/elasticsearch.pid -d -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.conf=/etc/elasticsearch
497 36804 32.5 2.5 3090376 837096 ? SLl 05:33 68:53 /usr/bin/java -Xms512m -Xmx512m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/elasticsearch-2.3.4.jar:/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -p /var/run/elasticsearch/elasticsearch.pid -d -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.conf=/etc/elasticsearch
root 39467 0.0 0.0 103320 924 pts/0 S+ 09:05 0:00 grep --color=auto -i Xms

If your heap size is that small, it would most certainly explain the issues. It seems the environment variable is not getting picked up by the user running Elasticsearch. I have not used Elasticsearch 2.x in years, so I cannot offer much more help.
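One quick way to confirm what heap the nodes are actually running with is to ask the cluster itself, for example (assuming localhost:9200 is reachable):

```
# Shows the configured maximum heap and current heap usage per node
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent'
```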

OK, let's see if anyone can help with this. @dadoonet, will you?

Making that environment variable visible to the Elasticsearch process is probably more of a Linux question than an Elasticsearch-specific one.

I got it. I have updated the heap size and it's working fine now.

Just for others who might face this in the future:
update the heap size directly in /etc/init.d/elasticsearch.
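For illustration only (the exact contents of the init script differ between versions and distributions), the idea is to set and export the variable before the service is started, for example:

```
# Illustrative snippet for /etc/init.d/elasticsearch, not the full script.
# Setting the value here ensures the process sees it even if the
# /etc/sysconfig/elasticsearch value is not being picked up.
ES_HEAP_SIZE=16g
export ES_HEAP_SIZE
```

After restarting the service, ps aux | grep -i xms should show -Xms16g -Xmx16g instead of the 512m defaults.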

Thank you so much @Christian_Dahlqvist @dadoonet. :slight_smile:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.