Additional data path not used

Hi,

We currently have an Elasticsearch cluster with two separate, equal servers as two data nodes - one master, one slave.
The servers each have 1 TB of disk space. However, master is now at 81% of disk usage and slave is at 74%. I'm not sure why they are not used equally, since each index and document should be present on both nodes, but I read that this can still happen.

Problem:
We have now decided to add an additional 1 TB disk to each of the nodes before upgrading to a third server/node. The path of the new disk was added to the /etc/elasticsearch/elasticsearch.yml file on each node.

Current configuration there:

# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch,/elasticsearch_data
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#

Furthermore, the permissions of the new disk were changed via

sudo chmod -Rv 777 /elasticsearch_data
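
A stricter alternative, assuming the stock elasticsearch service user that the Debian package creates, would be to hand the directory to that user instead of opening it up to everyone:

# assumes the Debian package's default elasticsearch user/group
sudo chown -R elasticsearch:elasticsearch /elasticsearch_data
sudo chmod -R 750 /elasticsearch_data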

After shutting down the Elasticsearch nodes and restarting them, all indices are green again, but the new disk has sat at just 75 MB of usage for multiple weeks, while the usage of the original disks keeps growing.

When cd'ing into the newly created path /elasticsearch_data, there is now a structure /nodes/0/…, with indices and a node.lock file in there.

However, Elasticsearch continues to write no data to that disk.

Am I missing anything, or is this behaviour expected?

Thanks!

edit: Elasticsearch version 7.6.2

Welcome to our community! :smiley:

There is no such thing as a slave node in Elasticsearch.

Are you creating new indices?

What do your logs show?

Thanks @warkolm, you guys are doing an awesome job with Elasticsearch!

No, currently no new indices are being created. But the way I understood the documentation, it should also use the new space for existing indices, right? Otherwise the size of a single index would be limited by the size of the single disk it landed on initially. Is this the case?

When checking the second data path, /elasticsearch_data/nodes/0/indices shows basically all the indices that I also get when calling http://myelasticserver/_cat/indices?. But, as said, the disk space in use has stayed at only 75 MB for weeks.
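
For a direct comparison on the host itself, the actual usage of both data paths (as configured above) can be checked with:

du -sh /var/lib/elasticsearch /elasticsearch_data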

In the logs that I checked, I was not able to find any error in that regard. Also, when calling http://95.217.108.253:9200/_cluster/stats?human&pretty I get, for example,

"fs": {
  "total": "3.6tb",
  "total_in_bytes": 3959280877568,
  "free": "2.3tb",
  "free_in_bytes": 2536149753856,
  "available": "2.1tb",
  "available_in_bytes": 2334777802752
},

This again shows me that the additional disk is recognized by Elasticsearch as available free storage. It just never uses it.
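
If you want the per-path breakdown rather than these cluster-wide totals, the nodes stats API reports each configured data path separately. A minimal sketch, assuming the cluster is reachable on localhost:9200:

# fs.data contains one entry per configured path.data location,
# with total/free/available reported per path
curl -s 'http://localhost:9200/_nodes/stats/fs?human&pretty'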

If you can post your logs from Elasticsearch startup, it'd be helpful.

I think that's expected: Elasticsearch won't rebalance shards across the paths, and given that you only have two nodes it can't rebalance them across nodes either. Support for multiple data paths is being removed, so it's unlikely this will ever be fixed now.

IMO a better solution is to add more nodes to use the extra disks. Alternatively you could use LVM or RAID or something to make a single filesystem that spans both disks, but that might be tricky if you aren't already using LVM.
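
If you do go the single-filesystem route, a rough LVM sketch might look like the following. The device name is an assumption taken from the mount list in the startup log below, and these commands destroy any data on that disk, so treat this as illustrative only:

# create a volume group from the new disk and carve out one logical volume
sudo pvcreate /dev/nvme1n1p1
sudo vgcreate es_vg /dev/nvme1n1p1
sudo lvcreate -l 100%FREE -n es_lv es_vg
sudo mkfs.ext4 /dev/es_vg/es_lv

Extending that volume group onto the original disk later is the tricky part, since that disk already carries a live filesystem.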

> [2021-04-28T17:06:29,550][INFO ][o.e.n.Node               ] [master] stopping ...
> [2021-04-28T17:06:29,571][INFO ][o.e.x.w.WatcherService   ] [master] stopping watch service, reason [shutdown initiated]
> [2021-04-28T17:06:29,573][INFO ][o.e.x.w.WatcherLifeCycleService] [master] watcher has stopped and shutdown
> [2021-04-28T17:06:29,794][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [master] [controller/7397] [Main.cc@150] Ml controller exiting
> [2021-04-28T17:06:29,795][INFO ][o.e.x.m.p.NativeController] [master] Native controller process has stopped - no new native processes can be started
> [2021-04-28T17:06:34,570][INFO ][o.e.n.Node               ] [master] stopped
> [2021-04-28T17:06:34,571][INFO ][o.e.n.Node               ] [master] closing ...
> [2021-04-28T17:06:34,595][INFO ][o.e.n.Node               ] [master] closed
> [2021-04-28T17:06:53,790][INFO ][o.e.e.NodeEnvironment    ] [master] using [2] data paths, mounts [[/elasticsearch_data (/dev/nvme1n1p1), / (/dev/md2)]], net usable_space [1tb], net total_space [1.8tb], types [ext4, ext3]
> [2021-04-28T17:06:53,792][INFO ][o.e.e.NodeEnvironment    ] [master] heap size [29.8gb], compressed ordinary object pointers [true]
> [2021-04-28T17:06:54,235][INFO ][o.e.n.Node               ] [master] node name [master], node ID [vEpkfBqTTLCOqBIuKFuT5Q], cluster name [sb-es-cluster]
> [2021-04-28T17:06:54,236][INFO ][o.e.n.Node               ] [master] version[7.6.2], pid[27228], build[default/deb/ef48eb35cf30adf4db14086e8aabd07ef6fb113f/2020-03-26T06:34:37.794943Z], OS[Linux/4.15.0-99-generic/amd64], JVM[AdoptOpenJDK/OpenJDK 64-Bit Server VM/13.0.2/13.0.2+8]
> [2021-04-28T17:06:54,236][INFO ][o.e.n.Node               ] [master] JVM home [/usr/share/elasticsearch/jdk]
> [2021-04-28T17:06:54,236][INFO ][o.e.n.Node               ] [master] JVM arguments [-Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=COMPAT, -Xms30g, -Xmx30g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.io.tmpdir=/tmp/elasticsearch-14245620407985491030, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -XX:MaxDirectMemorySize=16106127360, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=default, -Des.distribution.type=deb, -Des.bundled_jdk=true]
> [2021-04-28T17:06:55,419][INFO ][o.e.p.PluginsService     ] [master] loaded module [aggs-matrix-stats]
> [2021-04-28T17:06:55,419][INFO ][o.e.p.PluginsService     ] [master] loaded module [analysis-common]
> [2021-04-28T17:06:55,419][INFO ][o.e.p.PluginsService     ] [master] loaded module [flattened]
> [2021-04-28T17:06:55,419][INFO ][o.e.p.PluginsService     ] [master] loaded module [frozen-indices]
> [2021-04-28T17:06:55,420][INFO ][o.e.p.PluginsService     ] [master] loaded module [ingest-common]
> [2021-04-28T17:06:55,420][INFO ][o.e.p.PluginsService     ] [master] loaded module [ingest-geoip]
> [2021-04-28T17:06:55,420][INFO ][o.e.p.PluginsService     ] [master] loaded module [ingest-user-agent]
> [2021-04-28T17:06:55,420][INFO ][o.e.p.PluginsService     ] [master] loaded module [lang-expression]
........
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-ccr]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-core]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-deprecation]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-enrich]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-graph]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-ilm]
> [2021-04-28T17:06:55,422][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-logstash]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-ml]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-monitoring]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-rollup]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-security]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-sql]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-voting-only-node]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] loaded module [x-pack-watcher]
> [2021-04-28T17:06:55,423][INFO ][o.e.p.PluginsService     ] [master] no plugins loaded
> [2021-04-28T17:06:57,534][INFO ][o.e.x.s.a.s.FileRolesStore] [master] parsed [0] roles from file [/etc/elasticsearch/roles.yml]
> [2021-04-28T17:06:57,855][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [master] [controller/27342] [Main.cc@110] controller (64 bit): Version 7.6.2 (Build e06ef9d86d5332) Copyright (c) 2020 Elasticsearch BV
> [2021-04-28T17:06:58,189][DEBUG][o.e.a.ActionModule       ] [master] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
> [2021-04-28T17:06:58,257][INFO ][o.e.d.DiscoveryModule    ] [master] using discovery type [zen] and seed hosts providers [settings]
> [2021-04-28T17:06:58,730][INFO ][o.e.n.Node               ] [master] initialized
> [2021-04-28T17:06:58,730][INFO ][o.e.n.Node               ] [master] starting ...
> [2021-04-28T17:06:58,831][INFO ][o.e.t.TransportService   ] [master] publish_address {server1:9300}, bound_addresses {server1:9300}
> [2021-04-28T17:06:59,307][INFO ][o.e.b.BootstrapChecks    ] [master] bound or publishing to a non-loopback address, enforcing bootstrap checks
> [2021-04-28T17:06:59,326][INFO ][o.e.c.c.Coordinator      ] [master] cluster UUID [4Nd73AqsR0iUFMw2X-CY0g]
> [2021-04-28T17:06:59,483][INFO ][o.e.c.r.a.AllocationService] [master] updating number_of_replicas to [0] for indices [.apm-agent-configuration, .kibana_task_manager_1, .security-7, .kibana_1]
> [2021-04-28T17:06:59,485][INFO ][o.e.c.s.MasterService    ] [master] elected-as-master ([1] nodes joined)[{master}{vEpkfBqTTLCOqBIuKFuT5Q}{S2RADHtVT7GZcD2pTtzhSg}{server1}{server1:9300}{dilm}{ml.machine_memory=67543732224, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 29, version: 20444, delta: master node changed {previous [], current [{master}{vEpkfBqTTLCOqBIuKFuT5Q}{S2RADHtVT7GZcD2pTtzhSg}{server1}{server1:9300}{dilm}{ml.machine_memory=67543732224, xpack.installed=true, ml.max_open_jobs=20}]}
> [2021-04-28T17:06:59,589][INFO ][o.e.c.s.ClusterApplierService] [master] master node changed {previous [], current [{master}{vEpkfBqTTLCOqBIuKFuT5Q}{S2RADHtVT7GZcD2pTtzhSg}{server1}{server1:9300}{dilm}{ml.machine_memory=67543732224, xpack.installed=true, ml.max_open_jobs=20}]}, term: 29, version: 20444, reason: Publication{term=29, version=20444}
> [2021-04-28T17:06:59,613][INFO ][o.e.h.AbstractHttpServerTransport] [master] publish_address {server1:9200}, bound_addresses {server1:9200}
> [2021-04-28T17:06:59,614][INFO ][o.e.n.Node               ] [master] started
> [2021-04-28T17:07:00,194][INFO ][o.e.l.LicenseService     ] [master] license [a84a1202-3ce9-4455-8b55-515792ecd565] mode [basic] - valid
> [2021-04-28T17:07:00,195][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [master] Active license is now [BASIC]; Security is disabled
> [2021-04-28T17:07:00,199][INFO ][o.e.g.GatewayService     ] [master] recovered [115] indices into cluster_state
> [2021-04-28T17:07:00,877][INFO ][o.e.c.r.a.AllocationService] [master] updating number_of_replicas to [1] for indices [.apm-agent-configuration, .kibana_task_manager_1, .security-7, .kibana_1]
> [2021-04-28T17:07:00,879][INFO ][o.e.c.s.MasterService    ] [master] node-join[{slave}{R3yCJJkWSV2UUKVk78rd5w}{SuAR_ovWQAe2kFPQIfSXCQ}{server2}{server2:9300}{dilm}{ml.machine_memory=67543728128, ml.max_open_jobs=20, xpack.installed=true} join existing leader], term: 29, version: 20450, delta: added {{slave}{R3yCJJkWSV2UUKVk78rd5w}{SuAR_ovWQAe2kFPQIfSXCQ}{server2}{server2:9300}{dilm}{ml.machine_memory=67543728128, ml.max_open_jobs=20, xpack.installed=true}}
> [2021-04-28T17:07:01,239][INFO ][o.e.c.s.ClusterApplierService] [master] added {{slave}{R3yCJJkWSV2UUKVk78rd5w}{SuAR_ovWQAe2kFPQIfSXCQ}{server2}{server2:9300}{dilm}{ml.machine_memory=67543728128, ml.max_open_jobs=20, xpack.installed=true}}, term: 29, version: 20450, reason: Publication{term=29, version=20450}
> [2021-04-28T17:07:03,073][INFO ][o.e.c.m.MetaDataIndexTemplateService] [master] adding template [.management-beats] for index patterns [.management-beats]
> [2021-04-28T17:07:06,460][WARN ][o.e.c.r.a.AllocationService] [master] [en-04-2021-news-index][1] marking unavailable shards as stale: [cbuhmuxDS7i34Y-89B-MsQ]
> [2021-04-28T17:07:07,423][WARN ][o.e.c.r.a.AllocationService] [master] [en-04-2021-news-index][2] marking unavailable shards as stale: [XC-VOCs-Sr65LzfMxu3d2A]
> [2021-04-28T17:07:07,882][WARN ][o.e.c.r.a.AllocationService] [master] [en-04-2021-news-index][0] marking unavailable shards as stale: [KKj8T9V4QLWdQiFUCvJ9Bw]
> [2021-04-28T17:07:07,883][WARN ][o.e.c.r.a.AllocationService] [master] [de-04-2021-news-index][2] marking unavailable shards as stale: [vWVqY7XgTaue6ZcUm7yxRw]
> [2021-04-28T17:07:08,353][WARN ][o.e.c.r.a.AllocationService] [master] [other-04-2021-news-index][0] marking unavailable shards as stale: [-ECZpl2LTqKO1svwI6yg9w]
> [2021-04-28T17:07:08,353][WARN ][o.e.c.r.a.AllocationService] [master] [other-04-2021-news-index][2] marking unavailable shards as stale: [II2_FPW8Rz6W7nhZNbXwgQ]
> [2021-04-28T17:07:11,797][WARN ][o.e.c.r.a.AllocationService] [master] [other-04-2021-news-index][1] marking unavailable shards as stale: [M9irO4EeR0uIxb0468mKHA]
> [2021-04-28T17:07:12,244][WARN ][o.e.c.r.a.AllocationService] [master] [de-04-2021-news-index][1] marking unavailable shards as stale: [PHosvLmCQ1CdIaxgGXLQgw]
> [2021-04-28T17:07:13,076][WARN ][o.e.c.r.a.AllocationService] [master] [de-04-2021-news-index][0] marking unavailable shards as stale: [HkBKYEF0R2KgBdr8LB3V8w]
> [2021-04-28T17:07:26,779][INFO ][o.e.c.r.a.AllocationService] [master] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[at-companywebsite-index][2], [at-companywebsite-index][1]]]).
> [2021-04-28T17:08:29,161][INFO ][o.e.c.r.a.AllocationService] [master] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.apm-agent-configuration][0]]]).

Thanks for making me aware of that upcoming change. So what would be the "correct" way of doing it then, to be future-proof? Just adding more nodes seemed unnecessarily expensive when I compare the cost of an Elasticsearch node/server like we currently have with the cost of a 1 TB disk.

Would it be a working option to drop a bigger index and recreate it? Would it then be placed on the new disk, given that there is more free space available there?

Thanks!

By "node" I meant "Elasticsearch process" -- you can run more than one of these on the same physical host. However ...

... IMO the best way is to combine all your disks into a single volume using RAID or LVM.

Yes, I would expect newly created shards to be placed on the emptier disks. Maybe not all of them, as there is some preference for putting different shards of the same index onto different disks, but certainly the first shard copy of each new index will go onto the emptier disk.
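
If you do go the drop-and-recreate route, the usual mechanism is to reindex into a new index and delete the old one; the new index's shards can then land on the emptier path. A sketch, using one of the index names from the logs above and a hypothetical target name:

curl -s -X POST 'http://localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d '
{
  "source": { "index": "en-04-2021-news-index" },
  "dest":   { "index": "en-04-2021-news-index-v2" }
}'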

The physical location of each shard tends to be pretty well-hidden in the APIs. The only place I know to find it is in the output of the indices stats API if you set ?level=shards.
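
For example (again assuming localhost:9200), each shard copy's data_path shows up under shard_path in the shard-level output:

curl -s 'http://localhost:9200/_stats?level=shards&pretty' | grep -B 1 -A 3 '"shard_path"'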

OK, you can see from the startup log that it's using both data paths on start.

As David mentioned though, it doesn't rebalance.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.