Hi guys!
First of all, I'm really new to ELK, and to *nix stuff in general; I'm mostly a Windows guy, so this topic is pretty hard for me.
But I have an annoying issue with ELK. I have a few Windows AD domains, and all Domain Controllers (DCs) are configured to send their Security logs to the Logstash server via local NXLog agents (I'm not sure I'm using the right terminology). The ELK stack in my case consists of two servers (version 2.3.3): one runs Logstash and Kibana alongside an Elasticsearch node, the other hosts the second node of the cluster. (Servers and clients are WS2012R2.)
The ELK stack was installed by my predecessor in domain ALPHA, and in Kibana I can see logs from the Domain Controllers in ALPHA, but not from the DCs in domain BRAVO (nor CHARLIE or DELTA...). I checked the configuration of the local NXLog agents: they are almost identical, only the destination port differs slightly, and on the Logstash side both ports are handled in the configuration.
I can also see established connections between the Logstash server and a DC in domain BRAVO.
I never saw it working myself, but others claim it worked a while ago, and I have to believe that's true.
I already tried restarting the services, rebooting the machines (servers and clients alike), checking the cluster health status (green), checking the configurations and other logs, but I was unable to find any related information or solve it. My suspicion is a cross-domain privilege (access denied) issue, because the NXLog agents run under LOCAL SYSTEM, but I was also unable to find any access-denied error events... The domains are in one forest, fully trusting each other.
Please give me a clue where to start debugging! Thanks in advance!
On the client side, NXLog logs the following: "2018-04-24 08:29:57 INFO connecting to 1.2.3.4:1234" without any error (fake IP & port). On the server side I can also see the established connection with this command: "netstat -aon | findstr ":1234"". So I think the connection between NXLog and Logstash is OK.
Logstash input plugin configuration (part):
tcp {
    codec => "json"
    port  => 1234
    tags  => ["windows", "nxlog"]
    type  => "nxlog-json-cet"
}
The rest of the configuration is almost the same, just with different timezone abbreviations.
I have servers from different domains in the same data center, where the NXLog configs are almost identical except for one line: in the <Input Logs> section there is a local variable, Exec $DCDomain = "ALPHA"; or ... = "BRAVO";.
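For context, the relevant input section looks roughly like this (a sketch, not my exact file; the im_msvistalog module is what NXLog typically uses for the Windows Event Log, and the section name Logs matches the route below):

```
<Input Logs>
    Module  im_msvistalog
    Exec    $DCDomain = "ALPHA";
</Input>
```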
So in my opinion this is not network related. Maybe the logs actually are in Elasticsearch, and Kibana just isn't showing them? How can I verify that?
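One way to check this directly, bypassing Kibana (assuming Elasticsearch listens on the default port 9200 and the indices follow the default logstash-* pattern; BRAVO-DC01 is just a placeholder hostname):

```shell
# Ask Elasticsearch itself whether any events from a BRAVO host exist.
# A total hit count of 0, while the same query for an ALPHA host returns
# documents, means the events never reached Elasticsearch at all.
curl "http://localhost:9200/logstash-*/_search?q=Hostname:BRAVO-DC01&size=1&pretty"
```
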
To be exactly accurate: logs from one of my child domains (ALPHA, where ELK is installed) and from the root domain are visible in Kibana, but that is not true for the other child domains.
It turned out that there is an extra difference between the NXLog conf files... Where it didn't work, there were three extra lines:
<Route 3>
    Path Logs => FireEye
</Route>
Unfortunately, the FireEye IP address in the <Output FireEye> section was no longer valid; the service behind that IP had probably been stopped. Because of that, on the client side (in the NXLog log) I got the following error message:
2018-04-24 18:40:13 ERROR couldn't connect to tcp socket on 5.6.7.8:5678; No connection could be made because the target machine actively refused it.
And that was the reason... I just deleted the 'Route 3' part from the config, and after a service restart it instantly started pushing logs to Logstash. (Route 1 was always fine; I think it was skipped because Route 3 was there with a different path, ehh...)
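For anyone hitting the same thing, the working config ended up with only the Logstash route, roughly like this (a sketch; the input/output names Logs and Logstash come from my config and may differ in yours):

```
<Route 1>
    # Ship Windows Event Log entries only to Logstash;
    # the broken FireEye route has been removed entirely.
    Path Logs => Logstash
</Route>
```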
So, magnusbaeck, you pointed me to the root issue first, thank you for your help!