Running out of space in /var/lib/elasticsearch

I currently have a single node Elastic Stack (with Winlogbeat) instance setup using mostly default settings.

We're running out of storage space, so we need to add another location for Elasticsearch data nodes to store indices.

I wanted to know what happens behind the scenes when I execute the following steps:

Step 1: Modify /etc/elasticsearch/elasticsearch.yml using your favorite editor and change:

path.data: /var/lib/elasticsearch

to something like:

path:
  data:
    - /var/lib/elasticsearch
    - /mnt/elasticsearch_1
    - /mnt/elasticsearch_2
    - /mnt/elasticsearch_3

Step 2: Make sure elasticsearch:elasticsearch has access full access to /mnt/elasticsearch_* so that it can read, write, delete, etc.

Step 3: Restart the Elasticsearch service by running systemctl restart elasticsearch.

Here are my questions:

Question 1: What will happen to the data currently in path.data: /var/lib/elasticsearch ?
Question 2: Will the indices be distributed between the four location(s) or will they just be replicas?
Question 3: Is there a way to migrate the data from /var/lib/elasticsearch to the new location(s)?

I've come across a few threads, however, they're a bit old and wanted to make sure the steps were the same:

I'd test this out myself, however, I do not have a development box to work with.

Thanks for reading!

  1. Nothing, it will stay there
  2. They will be distributed. You cannot have both primary and replicas shards existing on the same node
  3. I have seen some people create a new location, manually mv things across and then update the config. That's not guaranteed to work, nor supported though.

Thanks for the quick reply, @warkolm.

Is there a supported way to move data from one location to another, or is this something that should have been considered at the beginning of the project?

Since this is only a proof of concept I'm working on, I can add it to my notes and plan my elasticsearch.yml config for production.

Thanks again for your help.

The only way I can think is to add more nodes and then do a bunch of moving around. Which isn't ideal.

Can you remove some of the indices you don't need?

Sure can. I'll be using curator to manage my indices. However, the issue is that our Elastic Stack node was initially setup on a 30GB SSD. :pensive:

I was able to migrate my data without any apparent issues. Instead of copying, I used rsync, which is a safer alternative.

Here's what I did:

Step 1: Create a new directory on the new drive to store your indices:


mkdir /apps/elasticsearch

Step 2: Disable shard allocation.

When you shut down a node, the allocation process waits for index.unassigned.node_left.delayed_timeout (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by disabling allocation before shutting down the node:


PUT _cluster/settings

{

"persistent": {

"cluster.routing.allocation.enable": "none"

}

}

Step 3: Stop indexing and perform a synced flush.

Performing a synced-flush speeds up shard recovery.

You will likely need to stop Logstash or Winlogbeat from sending events to Elasticsearch to stop the failed status codes: systemctl stop logstash.


POST _flush/synced

When you perform a synced flush, check the response to make sure there are no failures. Synced flush operations that fail due to pending indexing operations are listed in the response body, although the request itself still returns a 200 OK status. If there are failures, reissue the request.

Step 4: Stop Elasticsearch:


systemctl stop elasticsearch

Step 5: Copy your indices to the new location:


rsync --info=progress2 -auvrz /var/lib/elasticsearch /apps/elasticsearch

or using copy (not recommended)


cd /var/lib/elasticsearch

cp -RP * /apps/elasticsearch

Step 6: Change the ownership on copied files and directorys to elasticsearch:elasticsearch:


chown -R elasticsearch:elasticsearch /apps/elasticsearch

This step is only necessary when using the copy command.

Step 7: Change the path.data location in elasticsearch.yml using your favorite editor:


path.data: /apps/elasticsearch

If you have multiple locations where you're storing indices, you can use the following syntax:


path:

data:

- /mnt/elasticsearch_1

- /mnt/elasticsearch_2

- /mnt/elasticsearch_3

Step 9: Start Elasticsearch:


systemctl start elasticsearch

Step 10: Reenable allocation.

When all nodes have joined the cluster and recovered their primary shards, reenable allocation by restoring cluster.routing.allocation.enable to its default:


PUT _cluster/settings

{

"persistent": {

"cluster.routing.allocation.enable": null

}

}

Once allocation is reenabled, the cluster starts allocating replica shards to the data nodes. At this point it is safe to resume indexing and searching, but your cluster will recover more quickly if you can wait until all primary and replica shards have been successfully allocated and the status of all nodes is green.

It will take several minutes (or longer) depending on how many documents you have stored in your indices.

More information on this can be found on the following documentation pages:

2 Likes

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.