Utilizing all path.data directories in Elasticsearch

Is this possible at all?
We have three Elasticsearch servers, each with a similar config:

path:
  data:
    - /data/1/elasticsearch
    - /data/2/elasticsearch
    - /data/3/elasticsearch
    - /data/4/elasticsearch

df -h output:

/dev/sda4       5,2T  239G  4,7T   5% /data/1
/dev/sdb4       5,2T  304G  4,7T   7% /data/2
/dev/sdc4       5,2T  238G  4,7T   5% /data/3
/dev/sdd4       5,2T  283G  4,7T   6% /data/4

From Filebeat -> Logstash -> Elasticsearch we receive a lot of logs (around 30,000/sec). I disabled replication (we are not afraid of losing logs).
Each day we have two indices, filebeat-7d and filebeat2-7d, and we delete indices older than 7 days.
I didn't change the default number of shards per index.
In Grafana I can see that not all disks are being used:
(Grafana disk-usage screenshots for node1 and node2)

It would be great if I could force Elasticsearch to use all path.data directories. It seems to me that I must change the default number of shards per index, but should I increase or decrease it?
Searching on Google, I found that for high performance I should decrease the number of shards per index.
But, logically:
3 servers * 4 path.data = 12 disks
So 6 shards per index (with two indices per day, that is 12 shards total, one per disk)?
Could anyone explain how to choose the right number of shards, or another way to utilize all the disks?
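If the idea is to raise the shard count so that, ideally, every disk receives at least one shard, one way to set it is an index template. This is only a sketch: the template name filebeat_shards is made up, the syntax assumes the legacy template API of Elasticsearch 6.x/7.x, and note that Elasticsearch only balances shards per node, so 12 shards does not guarantee one shard per disk.

```shell
# Sketch: default primary shard count for new filebeat-* indices.
# Template name and shard count are assumptions, adjust to your cluster.
curl -X PUT "localhost:9200/_template/filebeat_shards" \
  -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["filebeat*"],
  "settings": {
    "number_of_shards": 12,
    "number_of_replicas": 0
  }
}'
```

The template only affects indices created after it is added, so you would see the effect when the next daily index rolls over.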

Elasticsearch balances shards across nodes, but not across data paths within each node. If you want this kind of balancing, I suggest you try running 12 nodes, each on a single data path.
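As a sketch of that layout, each server would run four Elasticsearch instances, each with its own elasticsearch.yml pointing at a single data path (the node names and ports below are made up for illustration):

```yaml
# Instance 1 on server 1 (elasticsearch.yml) - hypothetical names/ports
node.name: node1-disk1
path.data: /data/1/elasticsearch
http.port: 9200

# Instance 2 on server 1 (separate elasticsearch.yml)
# node.name: node1-disk2
# path.data: /data/2/elasticsearch
# http.port: 9201
# ...and likewise for /data/3 and /data/4, on each of the 3 servers,
# giving 12 single-disk nodes that Elasticsearch balances shards across.
```

With 12 single-path nodes, the cluster-level shard balancing works at the granularity of individual disks, which is what achieves the even utilization asked about above.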

Thank you very much!

