We have application logs coming in from a number of different hosts (shipped with Filebeat) and have observed a mixing of data streams for one of the log types.
We changed the Logstash input config from version A to B, which appears to have resolved the issue. We are using ELK 7.10.
A)
input {
  beats {
    client_inactivity_timeout => 1200
    port => 5044
    codec => line {
      charset => "ISO-8859-1"
    }
  }
}
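Version B, going by the change described later in the thread (dropping the line codec and charset), presumably reduces to just the port and timeout — a sketch of what we ended up with, not a verbatim copy:

```conf
input {
  beats {
    client_inactivity_timeout => 1200
    port => 5044
    # no codec: the beats input default (plain) is used
  }
}
```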
Sure, we have two different application log file types coming into separate daily Logstash indices, one for the server and one for the various services, each with its own input entry in filebeat.yml and a separate .conf filter file.
We observed in Kibana that log lines from the server log were consistently appearing in one of the service log fields, though this stopped after making the changes to the Logstash input conf (version B).
There is no crossover between the two log types or shared processes etc., so it was odd seeing server log lines appear in the service logs in Kibana.
Ok, so if we are sending multiline application logs, the multiline aspect should be configured in filebeat.yml and we don't need any additional conf for the Logstash input beyond what is already in version B?
I guess a better question would have been: is the config we are currently running best practice? Is there a recommended Logstash input codec for a setup such as this?
Also, will we be able to fix this (remove the server log lines from the service log indices) by just reindexing the previous daily log indices?
You would need to share your entire filebeat.yml file and also your entire conf file to help understand what happened. If you do, please use the preformatted text option, the </> button.
It is pretty hard to understand a configuration when it is not properly formatted.
But by default, different conf files in Logstash are not independent from each other; if you do not configure multiple pipelines, all the conf files inside /etc/logstash/conf.d are merged into one configuration.
If you have multiline logs and are using Filebeat, then the multiline configuration should be done on the Filebeat side, not in Logstash.
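A minimal sketch of what that looks like on the Filebeat side (7.x `log` input). The path, pattern, and field value are illustrative assumptions, not the actual config from this thread — it assumes each new event starts with a leading date:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/server.log            # hypothetical path
    fields:
      log_type: Server_log_files          # matches the conditional used later in the Logstash output
    fields_under_root: true               # put log_type at the top level so [log_type] works in Logstash
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'   # a new event starts with a date
    multiline.negate: true
    multiline.match: after                # continuation lines are appended to the previous event
```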
The beats input is pretty simple; in most cases you just need the port and that's it.
It depends on your case, and you didn't provide enough context. Beats always sends data as JSON with the UTF-8 charset, so there is no need to change the charset. If you want, you can use the json codec in the input to parse the data directly in the input, but you can also do the same thing with a json filter in the filter block.
I prefer to leave the codec as the default, which is the plain codec, and do the parsing in the filter block.
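The filter-block approach described above looks roughly like this — a sketch assuming the JSON payload arrives in the `message` field, which is the usual default:

```conf
filter {
  # parse the JSON string in "message" into top-level event fields
  json {
    source => "message"
  }
}
```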
output {
  if [log_type] == "Server_log_files" {
    if "elastic" in [tags] {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => ["logstash-srv-%{+YYYY.MM.dd}"]
        document_id => "%{fingerprint}"
      }
    }
    stdout { codec => rubydebug }
  }
  if [log_type] == "DI_log_files" {
    if "elastic" in [tags] {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => ["logstash-di-%{+YYYY.MM.dd}"]
        document_id => "%{fingerprint}"
      }
    }
    stdout { codec => rubydebug }
  }
}
When you say it's the default, do I have to define codec => plain (or similar) in the input.conf, or will just defining beats and the port cover this?
Multiple pipelines are used when you want to avoid the risk of data from two or more different inputs mixing up, since each pipeline runs independently from the others.
If you do not use multiple pipelines, Logstash will merge all the configuration files in the configuration path.
But in your case this is not the issue, your Filebeat is configured to send logs to Logstash on port 5044, and you can have only one input listening on this port.
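Multiple pipelines are declared in pipelines.yml, and each pipeline would need its own beats input listening on a distinct port. A hypothetical two-pipeline layout — the ids, paths, and ports are illustrative, not from this thread:

```yaml
# /etc/logstash/pipelines.yml
- pipeline.id: server-logs
  path.config: "/etc/logstash/conf.d/server/*.conf"   # beats input on e.g. port 5044
- pipeline.id: di-logs
  path.config: "/etc/logstash/conf.d/di/*.conf"       # beats input on e.g. port 5045
```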
Also, you have conditionals in your configuration, I don't see how your logs would be mixed in the output.
You are correctly adding a field for each input in Filebeat and using this field to filter the logs in your Logstash; this is already the right approach.
Are you still experiencing mixed logs?
For the beats input you just need to set the port, you do not need to set the codec as plain because this is already the default codec.
You will however need a json filter in your logstash pipeline if you do not set the beats codec to json.
Just to clarify, our issue was resolved after making the changes to our logstash input.conf (removing line codec and charset). So all good on that front.
Is it only possible to have multiple logstash pipelines if logstash is pulling the files statically from disk/network share?
How would I apply this to our current setup, and is it necessary? It looks as though the logs are coming in as hoped.
Do I just have to adjust output.conf?
I am curious about Logstash's own logs: if I want to see any errors, or just how things are going in general as each log is read/ingested/sent to Elasticsearch, how do I see this?
Looking through logstash-plain.log only shows me Logstash's initial connection to Elasticsearch and that's about it.
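One way to watch what each pipeline is doing, assuming the default monitoring API is enabled on port 9600, is the node stats endpoint — the per-plugin event counts show whether documents are actually flowing through each input, filter, and output:

```shell
# query the local Logstash monitoring API (default port 9600)
curl -s 'localhost:9600/_node/stats/pipelines?pretty'
```

This only reports counters and timings; for per-event detail you would still rely on something like the stdout rubydebug outputs already present in the output config.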