Adding more path.data folders and cluster becomes red

Hello,

I'm facing a problem I can't get rid of.

I'm running a two-node Elasticsearch cluster: 1 master + 1 data node. Everything is running smoothly and all the indices are green, up and running (though there are no replicas right now).

My current elasticsearch.yml configuration is:

path.data: /path/to/data

However, I wanted to add an additional path (an LVM volume) to expand Elasticsearch's disk space. I shut down the ES data node:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{"transient" : {"cluster.routing.allocation.enable" : "none"}}'
curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'
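
As a sanity check, the transient setting can be verified before the shutdown, e.g.:

curl -XGET 'http://localhost:9200/_cluster/settings?pretty'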

then I changed the elasticsearch.yml conf file as follows:

path.data: ["/path/to/data", "/path/to/newdata"]

Then I restarted the data node, followed by:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{"transient" : {"cluster.routing.allocation.enable" : "all"}}'
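
After re-enabling allocation, recovery progress can be followed with the usual monitoring endpoints, e.g.:

curl -XGET 'http://localhost:9200/_cat/recovery?v'
curl -XGET 'http://localhost:9200/_cluster/health?pretty'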

The cluster immediately turned red with all the shards unassigned.
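
The unassigned shards can be listed with the cat shards API, for example:

curl -XGET 'http://localhost:9200/_cat/shards?v' | grep UNASSIGNED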

I shut down the node again, removed the second path, restarted the cluster, and everything went green again. Note that Elasticsearch had correctly detected the new data path: the total disk space reported was indeed the sum of the two folders.

How can I add a second path to the ES data node to increase disk space and have Elasticsearch correctly recognize it?

Many thanks in advance for your help!


Are there any strange outputs in the logs?

Set the log level to debug and take a look at $ES_HOME/logs/*
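
If editing logging.yml and restarting is not an option, the log level can also be raised on the fly via the cluster settings API; the logger name below (discovery) is just the example from the docs, pick whichever module you want to trace:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{"transient" : {"logger.discovery" : "DEBUG"}}'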

Here is some information to help you do this:
https://www.elastic.co/guide/en/elasticsearch/guide/current/logging.html

Never mind, better try this:

path.data: ["/mnt/first", "/mnt/second"]
or this:
path.data: /mnt/first,/mnt/second

https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html

I'm already doing path.data that way.

Let me add something more:

_nodes/stats (BEFORE)

 "fs": {
    "timestamp": 1445875849977,
    "total": {
       "total_in_bytes": 50647003136,
       "free_in_bytes": 39121285120,
       "available_in_bytes": 36850778112,
       "disk_reads": 6555,
       "disk_writes": 3959,
       "disk_io_op": 10514,
       "disk_read_size_in_bytes": 117785600,
       "disk_write_size_in_bytes": 34197504,
       "disk_io_size_in_bytes": 151983104,
       "disk_queue": "0",
       "disk_service_time": "0"
    },
    "data": [
       {
          "path": "/data/cluster-name/nodes/0",
          "mount": "/",
          "dev": "/dev/sda1",
          "type": "ext4",
          "total_in_bytes": 50647003136,
          "free_in_bytes": 39121285120,
          "available_in_bytes": 36850778112,
          "disk_reads": 6555,
          "disk_writes": 3959,
          "disk_io_op": 10514,
          "disk_read_size_in_bytes": 117785600,
          "disk_write_size_in_bytes": 34197504,
          "disk_io_size_in_bytes": 151983104,
          "disk_queue": "0",
          "disk_service_time": "0"
       }
    ]
 },

_nodes/stats (AFTER)

"fs": {
        "timestamp": 1445876141872,
        "total": {
           "total_in_bytes": 940360904704,
           "free_in_bytes": 649207984128,
           "available_in_bytes": 626626637824,
           "disk_reads": 8840,
           "disk_writes": 246,
           "disk_io_op": 9086,
           "disk_read_size_in_bytes": 127649792,
           "disk_write_size_in_bytes": 13971456,
           "disk_io_size_in_bytes": 141621248,
           "disk_queue": "0",
           "disk_service_time": "0"
        },
        "data": [
           {
              "path": "/data/cluster-name/nodes/0",
              "mount": "/",
              "dev": "/dev/vda1",
              "type": "ext4",
              "total_in_bytes": 422616936448,
              "free_in_bytes": 131537268736,
              "available_in_bytes": 114234032128,
              "disk_reads": 8524,
              "disk_writes": 232,
              "disk_io_op": 8756,
              "disk_read_size_in_bytes": 126358528,
              "disk_write_size_in_bytes": 13914112,
              "disk_io_size_in_bytes": 140272640,
              "disk_queue": "0",
              "disk_service_time": "0"
           },
           {
              "path": "/data-new/cluster-name/nodes/0",
              "mount": "/data-new",
              "dev": "/dev/mapper/vg0-lvol0",
              "type": "ext4",
              "total_in_bytes": 517743968256,
              "free_in_bytes": 517670715392,
              "available_in_bytes": 512392605696,
              "disk_reads": 316,
              "disk_writes": 14,
              "disk_io_op": 330,
              "disk_read_size_in_bytes": 1291264,
              "disk_write_size_in_bytes": 57344,
              "disk_io_size_in_bytes": 1348608
           }
        ]
     },

The only difference I see when I add this second path is the following DEBUG error in the log file:

[2015-10-26 16:17:01,012][DEBUG][action.search.type       ] [node-name] All shards failed for phase: [query]
org.elasticsearch.action.NoShardAvailableActionException: [data-index][14]

which loops endlessly.
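
In case it helps to cross-check, the on-disk shard directories for that index can be listed directly, using the node paths from the stats output above:

ls /data/cluster-name/nodes/0/indices/data-index
ls /data-new/cluster-name/nodes/0/indices/data-index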

Do the directories have the same permissions?
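
You can compare them (and, if they differ, align the new path) with something like the following; the elasticsearch user and group are an assumption, use whatever your ES process runs as:

ls -ld /data /data-new
ls -ld /data/cluster-name /data-new/cluster-name
sudo chown -R elasticsearch:elasticsearch /data-new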

Yes, exactly the same permissions :disappointed_relieved:

Do you have very important data in your cluster?

Can you try to only include the new path and remove the old path?

I don't have any other ideas at the moment, I'm sorry.

Sadly I cannot do that kind of test, as the cluster contains critical data :frowning:

OK, maybe you can manage to create a temporary second cluster somewhere safe, on another VM or locally. Unfortunately I don't have the time to try this out right now.

That's not actually a restart though, that just re-enables allocation.
Did you restart the actual ES process on the node you issued the shutdown to?

Actually I rebooted the whole server, not just the process :frowning:

Should the master node follow the same multi-path configuration (even though the master is dataless)?
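
For context, the master is a dedicated (dataless) master, i.e. configured roughly like this:

node.master: true
node.data: false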

I am not sure if this is necessary, but I'd probably do it just to be sure.

I'm also experiencing the same problem. At the very least, I had to delete the index from the old path.data so that I could access Kibana again, but the cluster health is still "red" and no indices are visible in Kibana, even though the data exists on disk.