I'm not aware of Filebeat being able to parse XML. My recommendation would be to configure log4j to log with a JSON layout, which Filebeat can easily parse and forward directly to Elasticsearch.
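As a rough sketch of what that could look like with Log4j 2 (file names, paths, and levels here are placeholders you'd adapt to your setup, and `JsonLayout` needs the Jackson dependencies on the classpath):

```xml
<!-- log4j2.xml: write each log event as one JSON object per line,
     which line-oriented shippers like Filebeat can pick up easily -->
<Configuration>
  <Appenders>
    <File name="jsonFile" fileName="logs/app.json">
      <!-- compact + eventEol => one JSON document per line -->
      <JsonLayout compact="true" eventEol="true"/>
    </File>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="jsonFile"/>
    </Root>
  </Loggers>
</Configuration>
```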
If you could provide some details about your logging environment we might be able to give some more specific advice.
Thanks for your quick response. I will look into the JSON part.
However, we have an environment where logs are generated in XML format. So, if I use Logstash, maybe I could parse the XML. Is that possible? If yes, is there any documentation I can refer to? Preferably related to Log4j XML format.
Logstash can indeed parse XML fragments using the XML filter plugin. If the log format is not something you can change, you could have either:

- Filebeat ship the log events to Logstash, which parses the XML and forwards the JSON to Elasticsearch, or
- Logstash read the log file directly, parse the XML, and forward the JSON to Elasticsearch.
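In either case, the parsing step itself would be a small filter block; a minimal sketch using the xml filter plugin might look like this (the field name `parsed_event` is just an illustrative choice):

```
filter {
  xml {
    # Parse the raw XML string that arrives in the "message" field
    source => "message"
    # Store the resulting structured document under a dedicated field
    target => "parsed_event"
  }
}
```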
Using Filebeat to ship the logs involves more moving parts, but Filebeat is optimized for deployment on edge machines, so that would be my recommendation for ingesting XML-based logs.
I've been searching for a good example of this, but am unable to find one. As per your recommendation, I would like to use Filebeat to ship the log events to Logstash, which parses the XML and sends it to Elasticsearch. Could you point me to a sample for this XML setup?
I'm looking for the corresponding Filebeat and Logstash configurations.
Unfortunately I'm not aware of a comprehensive example that matches your description. I can try to point you at the relevant places in the docs though:
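That said, here is a rough, untested sketch of how the two configurations might fit together. The hosts, ports, paths, and the multiline pattern are all assumptions you'd adapt to your environment; the pattern below assumes Log4j 1.x's XMLLayout, where each event starts with `<log4j:event`.

```yaml
# filebeat.yml (sketch): ship multi-line XML events to Logstash
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.xml        # placeholder path
    # Join all lines of one XML event into a single message:
    # a new event begins on a line starting with "<log4j:event"
    multiline.pattern: '^<log4j:event'
    multiline.negate: true
    multiline.match: after

output.logstash:
  hosts: ["localhost:5044"]         # placeholder host
```

And the matching Logstash pipeline:

```
input {
  # Receive events from Filebeat
  beats { port => 5044 }
}
filter {
  xml {
    # Parse the XML string shipped in the "message" field
    source => "message"
    target => "event"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }   # placeholder host
}
```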