I have a production cluster of 20 nodes that I would like to configure audit logs for. Audit logging runs correctly on 3 of the nodes, but on the rest it stops logging events after a while. When I restart the service on one of those nodes, it logs for a while and then stops again. Am I missing something here? How should I debug this?
Hello @ryuujin, welcome to the community!
Have you compared the most recent entries visible in the ES index with the actual log messages present in the audit log files on the servers where logging seems to stop? It could be that other nodes are handling all the requests, so no auditable events ever reach those nodes and nothing gets logged.
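One quick way to check is to look at the latest audit event per node. Here is a minimal sketch, assuming Python with the `requests` library, a cluster reachable at `localhost:9200` without auth, and that your audit logs are shipped into an index matching `audit-logs-*` with `@timestamp` and `node.name` fields (all of these are placeholders, adjust to your setup):

```python
import requests

ES = "http://localhost:9200"    # assumption: cluster reachable here, no auth
AUDIT_INDEX = "audit-logs-*"    # assumption: audit events are indexed here

# Terms aggregation on node name with the latest event timestamp per node,
# so nodes whose audit trail has gone quiet stand out immediately.
# Use "node.name.keyword" instead if your mapping stores it as text.
query = {
    "size": 0,
    "aggs": {
        "per_node": {
            "terms": {"field": "node.name", "size": 50},
            "aggs": {"last_event": {"max": {"field": "@timestamp"}}},
        }
    },
}

resp = requests.post(f"{ES}/{AUDIT_INDEX}/_search", json=query)
resp.raise_for_status()

for bucket in resp.json()["aggregations"]["per_node"]["buckets"]:
    print(bucket["key"], bucket["last_event"]["value_as_string"])
```

Nodes whose last event is hours old while the others are current are the ones to investigate.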
I think you are right, thank you for pointing me in this direction. So if I understand this correctly, normally requests can be handled by a small number of nodes, while the rest just transport shards between each other and do not need to handle user requests?
Well, it's partially correct. Essentially, a bulk write request is handled by the nodes that hold primary shards for the target indices.
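You can see this for yourself with the `_cat/shards` API, which shows which node hosts each primary. A small sketch (Python with `requests`; `my-index` and `localhost:9200` are placeholders):

```python
import requests

ES = "http://localhost:9200"   # assumption: cluster reachable here, no auth
INDEX = "my-index"             # assumption: one of your bulk target indices

# _cat/shards lists, per shard copy, whether it is a primary ("p") or a
# replica ("r") and which node currently hosts it.
resp = requests.get(f"{ES}/_cat/shards/{INDEX}", params={"format": "json"})
resp.raise_for_status()

primaries = {
    row["node"]
    for row in resp.json()
    if row["prirep"] == "p" and row["state"] == "STARTED"
}
print("Nodes that execute bulk writes for", INDEX, "->", sorted(primaries))
```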
However, when it comes to search requests, all the nodes that hold either primary or replica shards of the target indices are involved in looking up the requested data. The partial result sets are then aggregated and sent back as the response by the coordinating node, which is the node that accepted the request and forwarded it into the ES cluster.
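To illustrate: whichever node you point the request at is the one that coordinates it. A quick sketch (Python with `requests`; the two hosts are placeholders for any two of your nodes' HTTP endpoints):

```python
import requests

# Assumption: HTTP endpoints of two different nodes in the cluster.
NODES = ["http://node1:9200", "http://node2:9200"]

query = {"query": {"match_all": {}}, "size": 1}

# Whichever node receives the request acts as the coordinating node:
# it fans the query out to the shards and merges the partial results.
for node in NODES:
    resp = requests.post(f"{node}/my-index/_search", json=query)
    resp.raise_for_status()
    print(node, "answered; shards queried:", resp.json()["_shards"]["total"])
```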
Thank you for explaining. So when a node acts as the coordinating node, it handles the incoming request end to end, and there is nothing I need to configure explicitly. I should just aggregate audit logs from all nodes, because the coordinator role may fall to any of the data nodes. Or are only master-eligible nodes chosen for that?
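In the meantime, this is what I am using to check the node roles (Python with `requests`; the endpoint is a placeholder for any node in my cluster):

```python
import requests

ES = "http://localhost:9200"   # assumption: any node's HTTP endpoint

# _cat/nodes lists each node's role letters (the exact set depends on the
# ES version) and marks the currently elected master with "*".
resp = requests.get(
    f"{ES}/_cat/nodes",
    params={"format": "json", "h": "name,node.role,master"},
)
resp.raise_for_status()

for row in resp.json():
    print(row["name"], "roles:", row["node.role"], "master:", row["master"])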