Thanks for your answer. So the DateTime parsing failures are a known issue that will be fixed soon. But I have a question: can this failure bring the whole cluster to a stop?
Throwing an exception is one of the more expensive things you can do in Java. If you're doing this on every doc in a firehose of log records I imagine that could get costly.
Also, it seems that you are sending these requests to a master node. You should not send requests to master nodes if you want better cluster stability (it is even worse if the requests generate parsing exceptions).
Consider using dedicated masters (an odd number such as 1, 3, or 5; 3 dedicated masters are recommended for high availability). They don't need to be as beefy as your data nodes (2GB and 4 cores should be more than enough for your cluster), but they need to fail independently. Also, remember to set discovery.zen.minimum_master_nodes to "(total_masters / 2) + 1", with the division rounded down.
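To make the "(total_masters / 2) + 1" rule concrete, here is a minimal sketch of the arithmetic, assuming integer division (rounding down). The function name minimum_master_nodes is just an illustrative helper, not part of any Elasticsearch client.

```python
def minimum_master_nodes(total_masters: int) -> int:
    """Return the quorum size: (total_masters / 2) + 1, using integer division.

    Hypothetical helper for illustration; the result is the value you would
    put in discovery.zen.minimum_master_nodes.
    """
    if total_masters < 1:
        raise ValueError("total_masters must be at least 1")
    return total_masters // 2 + 1


if __name__ == "__main__":
    # Print the suggested setting for the common odd-sized master groups.
    for n in (1, 3, 5):
        print(f"{n} master-eligible node(s): "
              f"discovery.zen.minimum_master_nodes = {minimum_master_nodes(n)}")
```

With 3 dedicated masters this gives 2, so two masters must be able to see each other before one of them can be elected.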
I'll add 3 dedicated masters to my cluster and then update this thread with the results.
Also, I would like to know whether there is a formula for calculating the ideal number of master nodes in a cluster.
There is no formula besides that it should be an odd number and that they fail independently. 3 dedicated masters will give you high availability (2 must fail at the same time for the cluster to become unavailable).
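A rough way to see why an odd number is preferred: with a majority quorum of (masters / 2) + 1, adding a fourth master does not let you survive any more failures than three do. The sketch below is only the arithmetic under that assumption; quorum and tolerated_failures are illustrative names.

```python
def quorum(masters: int) -> int:
    """Majority of master-eligible nodes: (masters / 2) + 1, rounded down."""
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """How many masters can fail while a majority is still reachable."""
    return masters - quorum(masters)


if __name__ == "__main__":
    # Compare group sizes: even sizes add a node without adding tolerance.
    for n in range(1, 6):
        print(f"masters={n} quorum={quorum(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

For 3 masters the tolerated count is 1, which matches the point above: the cluster only becomes unavailable when 2 of them are down at the same time.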