Currently, I'm using Elasticsearch to store and query some logs. We set up a five-node Elasticsearch cluster: two indexing nodes and three query nodes. Each of the two indexing servers runs Redis, Logstash, and Elasticsearch, and Elasticsearch uses NFS storage as its data store. Our requirement is to index 300 log entries/second, but the best performance I can get out of Elasticsearch is only 25 log entries/second!
Here's the detailed information on Stack Overflow: http://stackoverflow.com/questions/34449405/bad-indexing-performance-of-elasticsearch
My questions are:
Can anyone tell me what Elasticsearch is doing and why the indexing is so slow? Is it possible to improve it?
I know that Elasticsearch doesn't play well with NFS, but our performance requirement is not that high; 300-400 log entries/s would be enough. Also, if we don't use NFS, can we use iSCSI or some other kind of network storage? And if we upgrade the network interface to 10G, will that help? BTW, does this mean people have to use local storage and can't use a SAN?
Can you explain what you mean by "two indexing nodes and three query nodes"?
Using NFS as the data store is the cause of your problem.
It's almost certainly better to use iSCSI than NFS. Direct-attached storage is going to be better if you have decent direct-attached storage, but if you are with a shop that just loves their SAN, then it'll do. It's unusual to use a SAN, but as long as you treat it like a regular disk it should be safe.
NFS is known to cause trouble, though I personally don't know the whole story. I've seen it do some horrible things in the past, and I don't know if it's improved since then.
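If you want to see the gap for yourself, here's a rough sketch (not a proper benchmark): small synchronous writes are roughly the kind of I/O that indexing durability (translog fsyncs) costs you, so comparing dsync write latency on the NFS mount versus local disk gives a feel for the difference. The path below is just an example; point `TARGET` at your NFS mount and then at a local disk and compare what `dd` reports.

```shell
# Rough probe of synchronous small-write latency.
# Run once with TARGET on the NFS mount, once on local disk, and compare
# the throughput figure dd prints on its last line.
TARGET=/tmp/es-write-probe   # illustrative path; change to your NFS mount
dd if=/dev/zero of="$TARGET" bs=4k count=250 oflag=dsync 2>&1 | tail -1
rm -f "$TARGET"
```

On NFS, each dsync write pays a network round trip, which is usually where the indexing throughput goes.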
That means we have a 5-node Elasticsearch cluster: 2 are dedicated to indexing writes and 3 are exposed to customers for search.
But are they client nodes, data nodes, or master nodes?
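For context on those terms: the role is set per node in elasticsearch.yml. A sketch, assuming 1.x/2.x-style settings (the config path below is illustrative, not your real one); for example, a dedicated client node is not master-eligible and holds no data, it only routes requests:

```shell
# Hypothetical example: append 1.x/2.x-style role settings to a node's config.
# A "client" node (node.master: false, node.data: false) only routes requests;
# a data node would set node.data: true instead.
# A real install would use e.g. /etc/elasticsearch/elasticsearch.yml;
# /tmp is used here purely for illustration.
CONF=/tmp/elasticsearch.yml
cat <<'EOF' >> "$CONF"
node.master: false
node.data: false
EOF
cat "$CONF"
```

Knowing which of your 5 nodes hold data and which are master-eligible matters a lot for diagnosing where the indexing bottleneck actually sits.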
Really? I ask because, you know, iSCSI is also network storage. Will it improve the performance of Elasticsearch a lot?