I wanted to know whether we can run ES on SAN storage, or whether we should use local disks.
Which is the right solution, based on your experience?
We strongly recommend using local storage whenever possible.
We ran several experiments indexing streaming data into an OpenStack VM cluster with no local storage, and performance was bad. It is especially bad for IO-bound operations (which most indexing and searching is).
We were spending over 95% of the time waiting on IO operations to finish.
So yeah, local storage is a gajillion times better.
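For anyone who wants to check how much time their own nodes spend waiting on IO, here is a minimal sketch, assuming Linux nodes that expose /proc/stat. It compares two snapshots of the aggregate "cpu" line and reports the share of CPU time spent in iowait. The function names are made up for illustration, and this is only a rough proxy, not necessarily how the 95% figure above was measured.

```python
import time


def cpu_times(stat_text):
    """Return the first eight counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Order: user nice system idle iowait irq softirq steal.
            # Guest fields are skipped; they are already counted in user/nice.
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    first, second = cpu_times(before), cpu_times(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    # iowait is the fifth counter (index 4) on the 'cpu' line.
    return 100.0 * deltas[4] / total if total else 0.0


def sample_iowait(seconds=5, path="/proc/stat"):
    """Take two snapshots `seconds` apart and return the iowait percentage."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)
```

Calling sample_iowait() on a data node during a heavy indexing run gives a number you can compare between a SAN-backed node and a local-disk node.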
I understand that local disk is faster (provided it is engineered correctly), but what sort of subsystem does the generalization that SAN sucks assume? We have 90% of our working database environment on SAN storage of some sort, most of it with <1.5ms response times. Are there any benchmarks you can point to that show even the best-designed SAN disks under ES performing, say, 10x slower than an equivalent number of reasonably built local disks? In this case it would be VMware running over 8Gb FC to a bunch of DRAM/SSD/15k drives in a pool dedicated to the app.
If your SAN is much nicer than your local disks and you have gobs of clear bandwidth then it'll be faster. It adds an extra single point of failure but if everything you run is already on the SAN you probably have redundant heads and stuff and losing Elasticsearch wouldn't be your biggest problem if the SAN dropped.
Just make sure the SAN looks just like a physical disk. Don't use NFS or anything else that doesn't behave like a local disk. If you do that it'll probably be just like a local disk, so I wouldn't worry too much. I'd be comfortable deploying that. But be warned: when @jpountz tells you it isn't recommended, it isn't.
It's not recommended, but it'd be worth you testing it anyway so you can see why.
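If a full cluster test isn't practical, a cheap first check is to time small synced writes on the SAN volume and on a local disk. The sketch below is only that: a rough latency probe, not an Elasticsearch benchmark. The function names and default sizes are made up for illustration, and it assumes you can write a temp file in the directory you pass in.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies_ms(directory, writes=200, block_size=4096):
    """Time `writes` small write+fsync cycles in `directory`; return latencies in ms."""
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to the device, not just the page cache
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)  # clean up the scratch file
    return latencies


def summarize(latencies):
    """Median and approximate 99th-percentile latency in ms."""
    ordered = sorted(latencies)
    p99 = ordered[min(len(ordered) - 1, int(len(ordered) * 0.99))]
    return {"median_ms": statistics.median(ordered), "p99_ms": p99}
```

Run summarize(fsync_latencies_ms("/path/on/san")) and the same against a local path (both paths are placeholders) and compare the tail latency as well as the median, since the tail is usually where shared storage hurts.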
OK. The problem with "testing" the idea is that the production location we will be deploying to is "logistically challenged", meaning it is a PITA to get things racked and deployed, so building a bunch of prod-ready physical nodes in a prod environment is easier said than done. Adding SSD drives to a SAN, however, takes more time to cut the PO than to actually install them in the array. The other thing I got to thinking about: is there a way to disable the replica copies if we deploy ES on the SAN, so that we are not storing copies of copies? Writing replica copies to SAN disks would logically imply crappier performance if whoever configured it didn't consider the RAID built into the SAN array. Regardless, we are considering the advice as noted.
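On the replica question: the replica count is a per-index setting, index.number_of_replicas, and it can be changed on a live index through the index settings API. Below is a minimal sketch that only builds the request. The base URL and the index name "logs" are assumptions for illustration. Keep in mind that with zero replicas a node failure makes that node's shards unavailable until it comes back, even if the data itself is safe on the array.

```python
import json
import urllib.request


def replica_settings_request(base_url, index, replicas=0):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical cluster address and index name; nothing is sent here.
req = replica_settings_request("http://localhost:9200", "logs", replicas=0)
```

Passing req to urllib.request.urlopen would apply the change, and the same call with replicas=1 turns replication back on.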