I am new to the Elastic Stack and am trying to create a log management system. While searching I came across MDC. I would like to know what MDC logs are, how they are useful with respect to Filebeat or Logstash log parsing, and how they can help me maintain the ECS format. Any information on this would be appreciated.
MDC stands for "Mapped Diagnostic Context". In general, it lets you provide the logging framework with information that is not available at the location of the logging call. This is not a feature of the Elastic Stack but of the respective logging framework.
Example:
You have a web application where a user logs in and can browse pages which get data from a database. In the web controller you do MDC.put("session", sessionId);
In your database access layer you create logs if an error occurs, but that layer does not know which user initiated the call.
In the log file you can still access the session ID because the value was stored in the MDC.
This helps you group log entries that belong together: you can see what actions the user did in the same session before the error occurred, so you might be able to reproduce it.
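As a minimal sketch with the SLF4J API (the class name LoginController and the MDC keys are just illustrative):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class LoginController {

    private static final Logger log = LoggerFactory.getLogger(LoginController.class);

    public void handleRequest(String sessionId, String userName) {
        // Store request-scoped values in the MDC; every log entry written
        // on this thread can now carry them, even from layers that never
        // see the session object.
        MDC.put("session", sessionId);
        MDC.put("user.name", userName);
        try {
            log.info("user logged in");
            // ... calls into the database access layer happen here; its
            // log entries automatically include session and user.name ...
        } finally {
            // Clear the MDC so the values do not leak into other requests
            // handled by the same pooled thread.
            MDC.clear();
        }
    }
}
```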
You can also log other user-relevant data in the same way, which helps you fill the fields under user.*.
Other use cases are the APM agents: https://www.elastic.co/guide/en/apm/agent/index.html
They can add their IDs to the MDC so you can correlate the data between your log entries and the application performance monitoring.
Thanks @Wolfram_Haussig for replying. Okay, I get that MDC is used to group log entries together. I want to reduce the use of grok patterns in Logstash, so that log entries can be easily divided into fields without grok patterns. Is MDC logging somehow related to this?
You want to reduce the amount of Grok? This has nothing to do with MDC - on the contrary: With MDC you get more fields which would then have to be parsed by Grok!
Which logging framework do you use? Some logging frameworks support writing logs as JSON, which can be easily parsed by Logstash without the use of Grok, so you could do something like this:
- configure logging to write a JSON log file for easier import into Elastic
- keep the existing text log files for administrators (JSON logs are harder to read)
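For reference, here is a minimal sketch of such a dual setup, assuming Log4j2 (the file names are made up for the example); properties="true" on the JsonLayout is what copies the MDC values into the JSON output as fields:

```xml
<Configuration>
  <Appenders>
    <!-- JSON log file for Filebeat/Logstash: one JSON object per line -->
    <File name="JsonFile" fileName="logs/app.json">
      <JsonLayout compact="true" eventEol="true" properties="true"/>
    </File>
    <!-- Plain-text log file for administrators -->
    <File name="TextFile" fileName="logs/app.log">
      <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger - %msg %X{session}%n"/>
    </File>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="JsonFile"/>
      <AppenderRef ref="TextFile"/>
    </Root>
  </Loggers>
</Configuration>
```

With compact="true" and eventEol="true" each event becomes a single JSON line, which Filebeat can decode without any Grok.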
You may want to look into this new library as well. It's meant to be an adapter between your Java application and logging to the Elastic Stack in ECS format
I have tried it, and it worked.
But there was a problem with the timestamps:
Instead of the actual exact time the event occurred in my Java app, in Kibana I saw the timestamp at which Filebeat shipped the event to Elasticsearch, which is some time later.
You want the original timestamp produced by your Java app to be used by Elasticsearch, and not two different, confusing timestamps?
In your filebeat.yml (which reads the JSON-formatted log files created by your Log4j2/Java app):
```yaml
processors:
  - add_host_metadata: ~
  - timestamp:
      field: RFC3339Nano
      layouts:
        - '2006-01-02T15:04:05.999999999Z07:00' # RFC3339Nano from https://godoc.org/time#pkg-constants
      test:
        - '2019-09-21T14:37:25.552000000+02:00' # example from an actual log4j2 JSON logfile, when using the JsonLayout described above
  - drop_fields:
      fields: [RFC3339Nano]
```
It took me some hours to figure this out, but now it works great!
But what I still don't get is why one would send the events to Logstash (which sends them on to Elasticsearch) instead of sending them directly to Elasticsearch. I'm currently doing the same for apparently no reason, except that the Elasticsearch documentation recommends it.
As I wrote in my other answer above: why bother with Logstash at all when using JSON logs? Why not just let Filebeat send the JSON logs to Elasticsearch directly?
It's indeed possible to send to an Elasticsearch cluster directly. You will absolutely want to have security enabled on your Elasticsearch cluster before doing that, of course.
And even when Filebeat is sending directly to Elasticsearch, you can still do server-side processing with Elasticsearch ingest processors, similar to what you're already doing with the Beats processors.
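For example, the timestamp handling from the filebeat.yml above could be moved server-side into an ingest pipeline. This is only a sketch: the pipeline name parse-app-timestamp is made up, and it assumes the same RFC3339Nano field as in the Filebeat config.

```json
PUT _ingest/pipeline/parse-app-timestamp
{
  "description": "Set @timestamp from the application's own time field",
  "processors": [
    {
      "date": {
        "field": "RFC3339Nano",
        "formats": ["ISO8601"],
        "target_field": "@timestamp"
      }
    },
    {
      "remove": {
        "field": "RFC3339Nano"
      }
    }
  ]
}
```

Filebeat would then reference this pipeline via the pipeline option of its elasticsearch output.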
Here are a few resources on securing your Elasticsearch cluster: