Azure Blob Storage - Using Blobfuse2

erikg · April 22, 2024, 5:37pm

Hello,
We are currently running self-managed Elastic using Azure VMs.
Due to the vast amount of data ingested (100 TB+), we are using Azure premium disks for our hot and cold data nodes.

We are trying to move away from the premium SSDs and are considering Azure Blob Storage. There seems to be a way to do this using BlobFuse2.

Is it possible to replace SSD Managed Disks with Azure Blob Storage for hot and cold data nodes?

DavidTurner · April 23, 2024, 6:50am

See these docs:

The contents of the path.data directory must persist across restarts, because this is where your data is stored. Elasticsearch requires the filesystem to act as if it were backed by a local disk, but this means that it will work correctly on properly-configured remote block devices (e.g. a SAN) and remote filesystems (e.g. NFS) as long as the remote storage behaves no differently from local storage. You can run multiple Elasticsearch nodes on the same filesystem, but each Elasticsearch node must have its own data path.

The performance of an Elasticsearch cluster is often limited by the performance of the underlying storage, so you must ensure that your storage supports acceptable performance. Some remote storage performs very poorly, especially under the kind of load that Elasticsearch imposes, so make sure to benchmark your system carefully before committing to a particular storage architecture.

In my experience FUSE-based filesystems fail to satisfy the "behaves no differently from local storage" constraint, and also tend to have pretty poor performance overall, but I've never used this particular one.
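As a rough first check before a real benchmark, something like the sketch below could compare write-plus-fsync latency on a candidate mount against a local disk. It is only an assumption-laden illustration: the function names, the write count, and the block size are made up here, and a micro-test like this says nothing about how the filesystem behaves under a real Elasticsearch load.

```python
import os
import statistics
import tempfile
import time


def fsync_write_latencies(directory, writes=50, block_size=4096):
    """Time `writes` small write+fsync cycles on a temp file in `directory`.

    Returns a list of per-cycle latencies in seconds. The temp file is
    removed afterwards.
    """
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies (seconds) to median and max in milliseconds."""
    return {
        "median_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }
```

Running `summarize(fsync_write_latencies(path))` once against the local SSD data path and once against the BlobFuse2 mount point gives two sets of numbers to compare; a large gap is a warning sign, but a small gap does not prove the mount is safe to use as `path.data`.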

The best approach would be to use searchable snapshots - this feature is specifically designed to make the best use of blob storage.
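For illustration, mounting a snapshot as a searchable snapshot goes through the `_mount` API. The sketch below only builds the request (it does not send it), and assumes a snapshot repository backed by Azure Blob Storage has already been registered. The repository, snapshot, and index names and the helper function itself are hypothetical.

```python
import json
from urllib.parse import quote, urlencode


def build_mount_request(base_url, repository, snapshot, index,
                        storage="shared_cache"):
    """Build a POST request for the searchable snapshot mount API.

    `storage` is either "full_copy" (fully mounted index) or
    "shared_cache" (partially mounted index that keeps only a local
    cache and reads the rest from the blob store).
    """
    if storage not in ("full_copy", "shared_cache"):
        raise ValueError(f"unsupported storage option: {storage!r}")
    path = "/_snapshot/{}/{}/_mount".format(
        quote(repository, safe=""), quote(snapshot, safe="")
    )
    query = urlencode({"wait_for_completion": "true", "storage": storage})
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}{path}?{query}",
        "body": json.dumps({"index": index}),
    }


# Hypothetical names, for illustration only.
request = build_mount_request(
    "https://localhost:9200", "my-azure-repo", "snap-1", "logs-2024"
)
```

In practice this step is usually driven by an ILM policy rather than by hand, so treat this as a sketch of what happens underneath, not a recommended workflow.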

erikg · April 23, 2024, 2:33pm

Thanks. It seems like searchable snapshots would be useful, but the feature requires an Enterprise license.

DavidTurner · April 23, 2024, 2:37pm

That's true, this feature does cost money, but the opex savings will more than offset the license cost.
