Elasticsearch not starting after UID and GID change

Hello! I have a cluster of 2 nodes and I wanted to take snapshots by setting up an NFS server and sharing a directory as the snapshot repository. I had read that the elasticsearch user should have the same UID and GID on all nodes, so I changed them with the commands below on both nodes:

usermod -u 1020 elasticsearch
groupmod -g 1030 elasticsearch

but now I am not able to restart Elasticsearch. Can someone please help me with this?

Do you see any errors in the logs? Not sure we can help much without seeing a bit more detail.

Did you update the ownership of all of Elasticsearch's data too?
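
For example, something along these lines would show whether anything is still owned by the old numeric IDs; usermod and groupmod only change the account entries, they don't touch files already on disk. The 999 and 1000 below are just placeholders for whatever the old UID and GID were on your systems:

# files still owned by the old numeric UID (999 is a placeholder)
find / -xdev -uid 999 -ls

# files still owned by the old numeric GID (1000 is a placeholder)
find / -xdev -gid 1000 -ls

# or simply anything left with no matching user or group at all
# (-xdev stays on one filesystem; repeat per mount if needed)
find / -xdev \( -nouser -o -nogroup \) -ls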

Hi! I will post more details as soon as I can. I have used these commands to change the ownership on both nodes:

chown -R elasticsearch:elasticsearch /etc/elasticsearch
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
chown -R elasticsearch:elasticsearch /var/log/elasticsearch
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch

I have also tried root:elasticsearch ownership on these directories. It still won't restart.
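
For what it's worth, this is roughly how I am checking that the change actually took effect; the paths are just the standard package-install locations from the commands above:

# confirm the account now has the new IDs
id elasticsearch

# numeric owner/group of the Elasticsearch directories (should match the IDs above)
ls -ldn /etc/elasticsearch /var/lib/elasticsearch /var/log/elasticsearch /usr/share/elasticsearch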

These are the last log lines on node-1:

[2021-02-22T13:11:47,923][WARN ][r.suppressed             ] [node-1] path: /_snapshot/test, params: {pretty=true, repository=test}
org.elasticsearch.repositories.RepositoryVerificationException: [test] [[VNjwVdvSTqSzW5Ggxn3k0A, 'RemoteTransportException[[node-2][192.168.xx.xxx:9300][inte$
        at org.elasticsearch.repositories.VerifyNodeRepositoryAction.finishVerification(VerifyNodeRepositoryAction.java:120) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.repositories.VerifyNodeRepositoryAction.access$000(VerifyNodeRepositoryAction.java:49) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.repositories.VerifyNodeRepositoryAction$1.handleException(VerifyNodeRepositoryAction.java:109) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) [elasticsearch-7.9.3.jar:7.$
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) [elasticsearch-7.9.3.jar:7.$
        at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:235) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:226) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:233) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:225) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:115) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.3.jar:7.9.3]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.3.jar$
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.$
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.$
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Fi$
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.$
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.$
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Fi$
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314) [netty-handler-4.1.49.Final.jar:4.1.49.Final]

[2021-02-22T13:27:52,200][INFO ][o.e.c.r.a.AllocationService] [node-1] updating number_of_replicas to [0] for indices [.monitoring-kibana-7-2021.02.17, .kiba$
[2021-02-22T13:27:52,202][INFO ][o.e.c.s.MasterService    ] [node-1] node-left[{node-2}{VNjwVdvSTqSzW5Ggxn3k0A}{o7aZPvy4RvW1nkJ6Bn7k_A}{192.168.xx.xxx}{192.1$
[2021-02-22T13:27:52,225][INFO ][o.e.c.s.ClusterApplierService] [node-1] removed {{node-2}{VNjwVdvSTqSzW5Ggxn3k0A}{o7aZPvy4RvW1nkJ6Bn7k_A}{192.168.xx.xxx}{19$
[2021-02-22T13:27:52,236][INFO ][o.e.c.r.DelayedAllocationService] [node-1] scheduling reroute for delayed shards in [59.9s] (4 delayed shards)
[2021-02-22T13:27:52,298][WARN ][o.e.c.r.a.AllocationService] [node-1] [.kibana_task_manager_1][0] marking unavailable shards as stale: [hwd2UE1KQG24rY7Rx569$
[2021-02-22T13:27:52,683][WARN ][o.e.c.r.a.AllocationService] [node-1] [ilm-history-2-000001][0] marking unavailable shards as stale: [5CGBAphQRJyy88-qQXbfoA]
[2021-02-22T13:27:56,350][WARN ][o.e.c.r.a.AllocationService] [node-1] [.monitoring-es-7-2021.02.22][0] marking unavailable shards as stale: [pqCcDYO6Rqa3hx1$
[2021-02-22T13:28:01,338][WARN ][o.e.c.r.a.AllocationService] [node-1] [.monitoring-kibana-7-2021.02.22][0] marking unavailable shards as stale: [8-v4A8mVQ3-$
[2021-02-22T13:28:01,898][WARN ][o.e.c.r.a.AllocationService] [node-1] [metricbeat-7.9.3-2021.02.17-000001][0] marking unavailable shards as stale: [gr_ZA4M1$
[2021-02-22T13:28:17,936][WARN ][o.e.c.r.a.AllocationService] [node-1] [.tasks][0] marking unavailable shards as stale: [YJkFmARiQmKk0S3nVi3uLg]
[2021-02-22T13:28:18,076][WARN ][o.e.c.r.a.AllocationService] [node-1] [.security-7][0] marking unavailable shards as stale: [tk2cqtK5QgakcaoSPQ-cnA]
[2021-02-22T13:28:18,854][INFO ][o.e.n.Node               ] [node-1] stopping ...
[2021-02-22T13:28:18,858][INFO ][o.e.x.w.WatcherService   ] [node-1] stopping watch service, reason [shutdown initiated]
[2021-02-22T13:28:18,859][INFO ][o.e.x.w.WatcherLifeCycleService] [node-1] watcher has stopped and shutdown
[2021-02-22T13:28:18,863][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [node-1] [controller/42418] [Main.cc@154] ML controller exiting
[2021-02-22T13:28:18,865][INFO ][o.e.x.m.p.NativeController] [node-1] Native controller process has stopped - no new native processes can be started
[2021-02-22T13:28:19,683][INFO ][o.e.n.Node               ] [node-1] stopped
[2021-02-22T13:28:19,683][INFO ][o.e.n.Node               ] [node-1] closing ...
[2021-02-22T13:28:19,698][INFO ][o.e.n.Node               ] [node-1] closed
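
For context, the repository itself is a shared filesystem ("fs") repository on the NFS mount, registered roughly like this; /mnt/es_backups is a placeholder for the actual mount point, which is also listed under path.repo in elasticsearch.yml on both nodes:

curl -X PUT "localhost:9200/_snapshot/test?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_backups"
  }
}'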

This node was stopped gracefully, and deliberately, by an external influence; the stopping/stopped/closed lines at the end of your log show a normal, orderly shutdown rather than a crash.
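
If you're not sure what issued that stop, the systemd journal normally records it. Assuming a systemd-managed install with the default unit name, something like this should narrow it down:

# recent events for the Elasticsearch unit, including start/stop requests
journalctl -u elasticsearch --since "2021-02-22" --no-pager

# current state of the unit and its last exit status
systemctl status elasticsearch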

I thought the UID/GID change had stopped it. :thinking: Do I have to reinstall Elasticsearch on both nodes?

Without understanding what's wrong, it's impossible to say what needs to be done to fix it.

