Deprecate and remove Multiple Data Paths

Hi there,

I don't agree that this should be discussed here, but if it helps ¯_(ツ)_/¯
The issue is that Elastic is removing multiple data paths from the configuration, which is a big problem when you have a ton of data and need to pay a cloud provider for VMs and disks.
I shared my concerns in the GitHub issue:

If I understand this correctly, I won't be able to attach more than one disk per Elasticsearch node in the future. If that's right, it'll cause serious problems for me, since in GCP I have limited (and costly) options. To have 30GB of spare memory for the Java heap, I need a fairly big instance. Then, to keep my VM costs down, I'm attaching multiple 2TB disks that house my data, since creating a hot-warm-cold cluster would skyrocket my expenses. I need to cap each disk at 2TB because beyond that, I/O performance drops. So with that change, I either suffer from I/O rate deterioration or have to create a much more expensive cluster.
Let me know if I didn't understand something right. Thanks!

Like others in that GH issue, I don't see any good reason for this, nor a real solution for when it reaches GA.

Additionally, the implementation is complex, and not well-tested nor maintained, with practically no benefit over spanning the data path filesystem across multiple drives and/or running one node for each data path.

So this is simply false or wasn't well phrased. For me (the user), it's really simple to implement and it keeps my costs waaaay down. "...running one node for each data path" is a huge waste of money. I've been using that setup for over 6 years and have had 0 issues. I think whoever has worked with Elasticsearch knows that it's robust, it's updated frequently, and the updates are usually great (I saw several changes that improved performance significantly). However, limiting myself to one disk, which will result in 1/X speed*, or moving to six times more nodes ($800 per VM) is simply unacceptable.

*: GCP caps the performance of the disks, which means that if you have five 2TB disks with ~1000 Mbps each but you change to one 10TB disk, you'll have 1/5th of your previous throughput.
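
To make the arithmetic in the footnote concrete, here is a rough sketch. It assumes, as described above, that each disk is capped at roughly the same throughput regardless of its size, and it uses the ~1000 Mbps and $800-per-VM figures from this post. The function names are made up for illustration, and the real GCP limits depend on disk type and machine type, so treat this as a back-of-the-envelope model only.

```python
def aggregate_throughput_mbps(disk_count: int, per_disk_cap_mbps: float) -> float:
    """Total throughput if every disk is limited by the same per-disk cap.

    Assumption: the cap does not grow with disk size, so one big disk
    gets the same cap as one small disk.
    """
    return disk_count * per_disk_cap_mbps


def one_node_per_path_vm_cost(nodes: int, paths_per_node: int, vm_price: float) -> float:
    """VM spend if every data path has to become its own node (hypothetical helper)."""
    return nodes * paths_per_node * vm_price


# Five 2TB disks at ~1000 Mbps each versus a single 10TB disk with the same cap.
multi_disk = aggregate_throughput_mbps(5, 1000)
single_disk = aggregate_throughput_mbps(1, 1000)
ratio = single_disk / multi_disk

# One existing node with six data paths, split into six VMs at $800 each.
split_cost = one_node_per_path_vm_cost(1, 6, 800)

print(f"multi-disk: {multi_disk} Mbps, single disk: {single_disk} Mbps, ratio: {ratio}")
print(f"VM cost after splitting one node with six paths: ${split_cost}")
```

Under these assumptions the single large disk delivers one fifth of the multi-disk throughput, and splitting each node into one node per data path multiplies the VM bill by the number of paths.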

Please provide a way to be able to use this product in the future without degrading the indexing performance or blowing up the budget.

Thank you!
Peter


No takers for this one?

Why can't you use LVM to create a single volume out of multiple mounted disks?

Hi Christian,

AFAIK, it's not supported ("Logical Volume Manager (LVM) is not supported for Google Cloud provided images."). Also, not sure if you read this comment in the GH issue, but it seems to me that it won't work. However, I never had to use LVM and I'm open to solutions since I have a pretty large cluster to take care of.

Thank you!

In my view, LVM is not a good option compared to using individual disks as storage points.
LVM will be slower, and Elasticsearch needs to be a fast solution.

Software RAID is also very slow compared to individual disks, and it has other drawbacks.

Hello,

I'd love to plus one this. I'm happy the deprecation got postponed, as it would have caused me so much work and so many problems. I've been using MDP for more than 5 years and it works perfectly fine.

Besides working without any issue, it gives us a lot of flexibility and options, and it seriously reduces costs. When a disk breaks in a RAID 0, I need to wait for the new disk and rebuild the storage completely. We're talking about 6TB to 18TB. When a disk breaks in MDP, I can just let the node continue running and tune my ILM policies a bit. When the new disk arrives, my problem is solved and only one disk needs to sync.

Considering that a physical node has support for about 5 years and I have 2 generations of nodes running MDP, please note that I would have to redeploy a lot of hardware. It would be such a major time sink that I seriously would have just given up on Elastic. Since the deprecation has been postponed, we might be able to deal with the fallout, although it would still cost us a lot of time.

Removing this makes no sense to me at all. Couldn't Elastic just state that it is less supported instead of removing it completely?


I have updated to 7.17.1, which is the last one before the major update to 8.x, but I guess I can't do that upgrade, as the Upgrade Assistant says:

Critical Multiple data paths are not supported

Well, good luck with 8.x now.

:hot_face:


That sounds bad. I still haven't gotten any update on a solution(!) for this one. As far as I can see, the GitHub thread is missing that too.

I guess they have already decided. :smirk:

It's always sad to see a company let stuff die off. The issue was raised on GitHub and here too (as they advised). However, there is still no solution to an issue that will greatly change how the product can be used. This way people will be forced to move to a similar solution elsewhere to avoid paying hundreds or thousands of dollars more due to the new restriction.
They can't even say, "Hey, we need more money, we don't care about your problem". Though having both threads silently die amounts to the same thing for me.

