Currently, I'm using Elasticsearch to store and query some logs. We set up a five-node Elasticsearch cluster: two indexing nodes and three query nodes. Each of the two indexing servers runs Redis, Logstash, and Elasticsearch, and Elasticsearch uses NFS storage as its data store. Our requirement is to index 300 log entries/second, but the best performance I can get out of Elasticsearch is only 25 log entries/second!
Here's the detailed information on Stack Overflow: http://stackoverflow.com/questions/34449405/bad-indexing-performance-of-elasticsearch
My questions are:
Can anyone tell me what Elasticsearch is doing and why the indexing is so slow? Is it possible to improve it?
I know that Elasticsearch doesn't play well with NFS, but our performance requirement is not that high; 300-400 log entries/s would be enough. Also, if we don't use NFS, can we use iSCSI or some other kind of network storage? And if we upgrade the network interface to 10G, will that help? BTW, does this mean people have to use local storage and can't use a SAN?
Can you explain what you mean by "two indexing nodes and three query nodes"?
Using NFS as the data store is the cause of your problem.
It's almost certainly better to use iSCSI than NFS. Direct-attached storage is going to be better if you have decent direct-attached storage, but if you are with a shop that just loves their SAN, then it'll do. It's unusual to use a SAN, but as long as you treat it like a regular disk it should be safe.
NFS is known to cause trouble, though I personally don't know the whole story. I've seen it do some horrible things in the past, and I don't know if it's improved since then.
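If you want to see the gap for yourself, here's a rough sketch (not a proper benchmark): small synchronous writes are roughly the kind of I/O that indexing durability (translog fsyncs) costs you, so comparing dsync write latency on the NFS mount versus local disk gives a feel for the difference. The path below is just an example; point `TARGET` at your NFS mount and then at a local disk and compare what `dd` reports.

```shell
# Rough probe of synchronous small-write latency.
# Run once with TARGET on the NFS mount, once on local disk, and compare
# the throughput figure dd prints on its last line.
TARGET=/tmp/es-write-probe   # illustrative path; change to your NFS mount
dd if=/dev/zero of="$TARGET" bs=4k count=250 oflag=dsync 2>&1 | tail -1
rm -f "$TARGET"
```

On NFS, each dsync write pays a network round trip, which is usually where the indexing throughput goes.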
That means we have a 5-node Elasticsearch cluster: 2 are dedicated to indexing writes and 3 are exposed to customers for search.
But are they client nodes, data nodes, or master nodes?
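For context on those terms: the role is set per node in elasticsearch.yml. A sketch, assuming 1.x/2.x-style settings (the config path below is illustrative, not your real one); for example, a dedicated client node is not master-eligible and holds no data, it only routes requests:

```shell
# Hypothetical example: append 1.x/2.x-style role settings to a node's config.
# A "client" node (node.master: false, node.data: false) only routes requests;
# a data node would set node.data: true instead.
# A real install would use e.g. /etc/elasticsearch/elasticsearch.yml;
# /tmp is used here purely for illustration.
CONF=/tmp/elasticsearch.yml
cat <<'EOF' >> "$CONF"
node.master: false
node.data: false
EOF
cat "$CONF"
```

Knowing which of your 5 nodes hold data and which are master-eligible matters a lot for diagnosing where the indexing bottleneck actually sits.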
Really? I ask because, you know, iSCSI is also network storage. Will it improve the performance of Elasticsearch a lot?