Elasticsearch does not use all path.data

mobidyc · April 20, 2020, 4:45pm

Good morning guys,

I set up multiple path.data entries, but when I check the disk usage, the distribution looks wrong.

    [root@cld0286su ssd9]# grep path.data /etc/elasticsearch/elasticsearch.yml
    path.data: /srv/ssd1/elasticsearch,/srv/ssd2/elasticsearch,/srv/ssd3/elasticsearch,/srv/ssd4/elasticsearch,/srv/ssd5/elasticsearch,/srv/ssd6/elasticsearch,/srv/ssd7/elasticsearch,/srv/ssd8/elasticsearch,/srv/ssd9/elasticsearch

    [root@cld0286su ssd9]# du -hs /srv/ssd*/elasticsearch/
    664K	/srv/ssd1/elasticsearch/
    664K	/srv/ssd2/elasticsearch/
    664K	/srv/ssd3/elasticsearch/
    4,6G	/srv/ssd4/elasticsearch/
    664K	/srv/ssd5/elasticsearch/
    664K	/srv/ssd6/elasticsearch/
    780K	/srv/ssd7/elasticsearch/
    664K	/srv/ssd8/elasticsearch/
    56G	/srv/ssd9/elasticsearch/
    [root@cld0286su ssd9]#

Of course, I checked the number of shards (and their sizes) to make sure that everything does not belong to the same shard:

    [root@cld0286su ssd9]# curl -s 127.0.0.1:9200/_cat/shards |grep cld0286su |wc -l
    48
    [root@cld0286su ssd9]# 

    [root@cld0286su ssd9]# curl -s 127.0.0.1:9200/_cat/shards |grep cld0286su |grep gb
    stats-disk-2020.04-archive     0  r STARTED   7956473   1.5gb 10.160.186.91  cld0286su
    stats-disk-2019.11-archive     4  p STARTED  22652670   4.5gb 10.160.186.91  cld0286su
    stats-disk-2020.04.11          3  r STARTED  24827005   4.9gb 10.160.186.91  cld0286su
    stats-disk-2020.04.18          0  p STARTED  24328432   4.9gb 10.160.186.91  cld0286su
    stats-disk-2020.04.06          2  r STARTED  25472127   5.2gb 10.160.186.91  cld0286su
    stats-disk-2019.12-archive     3  r STARTED  23155079   4.7gb 10.160.186.91  cld0286su
    stats-disk-2020.03-archive     2  r STARTED  25373517   5.1gb 10.160.186.91  cld0286su
    stats-disk-2019.10-archive     11 r STARTED  23814628   4.7gb 10.160.186.91  cld0286su
    stats-disk-2020.04.15          0  p STARTED  25351113   5.1gb 10.160.186.91  cld0286su
    stats-disk-2019.08-archive     7  r STARTED  23253432   4.7gb 10.160.186.91  cld0286su
    stats-disk-2020.01-archive     8  p STARTED  23153243   4.6gb 10.160.186.91  cld0286su
    stats-disk-2020.02-archive     5  r STARTED  22064042   4.3gb 10.160.186.91  cld0286su
    [root@cld0286su ssd9]#

    [root@cld0286su ssd9]# df -h ../ssd*
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdak       184G  130G   54G  71% /srv/ssd1
    /dev/sdal       184G  161G   23G  88% /srv/ssd2
    /dev/sdam       184G  142G   42G  78% /srv/ssd3
    /dev/sdan       184G  108G   76G  59% /srv/ssd4
    /dev/sdao       184G  127G   57G  70% /srv/ssd5
    /dev/sdap       184G  130G   54G  71% /srv/ssd6
    /dev/sdaq       184G  113G   71G  62% /srv/ssd7
    /dev/sdar       184G  120G   65G  65% /srv/ssd8
    /dev/sdbg       373G  156G  218G  42% /srv/ssd9
    [root@cld0286su ssd9]#
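To see how the shards are spread over the data paths, rather than only the total size per path, one option is to count shard directories under each path. This is a minimal sketch: `count_shards_per_path` is a hypothetical helper, and it assumes the on-disk layout `<data path>/nodes/0/indices/<index uuid>/<shard id>`, which may differ between Elasticsearch versions.

```shell
# Hypothetical helper: print "<shard count><TAB><data path>" for each path given.
# Assumption: shards live in <data path>/nodes/0/indices/<index uuid>/<shard id>,
# where <shard id> is a purely numeric directory name.
count_shards_per_path() {
  for p in "$@"; do
    # List directories two levels below indices/ and keep the numeric ones.
    # grep -c exits non-zero when it counts 0, hence the "|| true".
    n=$(find "$p/nodes/0/indices" -mindepth 2 -maxdepth 2 -type d 2>/dev/null \
          | grep -c '/[0-9][0-9]*$' || true)
    printf '%s\t%s\n' "$n" "$p"
  done
}
```

It could be called as `count_shards_per_path /srv/ssd*/elasticsearch` on a node using the paths above.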

Configuration is the following:

    [root@cld0286su ssd9]# cat /etc/elasticsearch/elasticsearch.yml

    bootstrap.system_call_filter: false  # requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in

    cluster.name: MyCluster
    cluster.routing.allocation.disk.watermark.flood_stage: "99%"
    cluster.routing.allocation.disk.watermark.high: "94%"
    cluster.routing.allocation.disk.watermark.low: "90%"
    discovery.zen.minimum_master_nodes: 26
    discovery.zen.ping.unicast.hosts: ['xxx']
    http.cors.allow-origin: "*"
    http.cors.enabled: true
    network.bind_host: ['xxx']
    network.publish_host: 10.160.186.91
    node.data: True
    node.master: False
    node.name: cld0286su
    path.data: /srv/ssd1/elasticsearch,/srv/ssd2/elasticsearch,/srv/ssd3/elasticsearch,/srv/ssd4/elasticsearch,/srv/ssd5/elasticsearch,/srv/ssd6/elasticsearch,/srv/ssd7/elasticsearch,/srv/ssd8/elasticsearch,/srv/ssd9/elasticsearch
    path.logs: /data/logs/elasticsearch/
    reindex.remote.whitelist: "*:9200, *:9201"
    xpack.security.enabled: false
    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.collection.enabled: true
    discovery.zen.fd.ping_interval: 1s
    discovery.zen.fd.ping_timeout: 2s
    discovery.zen.fd.ping_retries: 3
    [root@cld0286su ssd9]#

I do not see what could be wrong in the disk allocator configuration. The cluster is stable and almost never has relocations.

Is there a way to move shards to a specific disk?
Maybe by stopping the node, moving the shard folder, and restarting the service? That would be a pain, though, as I have around 50 servers in the same situation.

Any help appreciated

willemdh · April 21, 2020, 6:06am

Maybe try data path as array:

    path:
      data:
        - /mnt/elasticsearch_1
        - /mnt/elasticsearch_2
        - /mnt/elasticsearch_3
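With around 50 servers to touch, rewriting the setting by hand is tedious, so here is a minimal sketch of a conversion helper. `to_yaml_list` is a hypothetical function, not part of Elasticsearch. It only changes the notation of the setting, and nothing in this thread confirms that doing so changes how shards are placed.

```shell
# Hypothetical helper: turn a comma-separated path.data value
# into the YAML list form shown above.
to_yaml_list() {
  printf 'path:\n  data:\n'
  # One path per line, trim surrounding whitespace, drop empty entries,
  # then prefix each path as a YAML list item.
  printf '%s\n' "$1" | tr ',' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' -e 's/^/    - /'
}
```

For example, `to_yaml_list "/srv/ssd1/elasticsearch,/srv/ssd2/elasticsearch"` prints a `path:` / `data:` block with one list item per path.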

warkolm · April 21, 2020, 6:24am

++ to that, it's how the docs reference it - https://www.elastic.co/guide/en/elasticsearch/reference/7.6/path-settings.html

system · May 19, 2020, 6:24am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
