Parse Hostname from offline logs and then apply to all events

Hi,

I'm building an offline log parser using logstash, ES and Kibana. In this scenario the logs are provided in a zip from servers. They are placed in a volume mounted into the logstash container, then pushed into ES and viewed from Kibana.

What this means is that the original server that generated the logs is not the server sending the events to ES.

What I wondered is how can I get logstash to parse the hostname from the offline logs, and then apply this to all events?

I actually don't think this is possible. I suspect logstash streams lines from the log file, so if the hostname first appeared on line 100, it could not go back and update the host value on the preceding events.

Another idea, which could work outside of logstash, would be to preprocess the offline logs for the hostname. This raises the slightly different question of how to pass this externally parsed value into logstash.

Hi @sirReeall,

Firstly welcome to the community!

Can I know whether a separate zip file is created for each host? If so, add the hostname to the zip file name.

Then, while reading the files in logstash, create a field from the file name metadata, which will then contain the hostname.

Regards
Dilip

Hey @kolli_dilip - Thank you for the warm welcome!

That's actually a really good idea, but unfortunately the log file names are generic, for example error.log, debug.log, etc.

@sirReeall can you please share a sample log? I will check whether I can come up with any ideas.

Regards
Dilip

You could use an environment variable.

Sure no problems:

Here is a sample where you can see that the host running the process is server01 (on the "Process id" line):

2021-09-02 12:59:22.129+0000 INFO  [com.example.WebServer] Gracefully shutting down process
2021-09-02 12:59:22.129+0000 INFO  [com.example.WebServer] Shutdown comple
2021-09-02 13:21:11.452+0000 INFO  [com.example.WebServer] Web server starting up
        --- Server Start up ---
        Operating System: Windows Server 2012 R2; version: 6.3; arch: amd64; cpus: 4
        Process id: 4104@server01
2021-09-02 13:21:14.862+0000 INFO  [com.example.LocksFactories] Locking selected
2021-09-02 13:21:15.307+0000 INFO  [com.example.SIModule] ServerId{eb63dcec} (eb63dcec-5ac0-4a88-8ffe-c3efa1437fbc)
2021-09-02 13:21:16.844+0000 INFO  [com.example.WebServer] Running version 1.0

I can parse out the host using grok. That's not the issue. The challenge is adding this parsed host to all events prior and post.
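One way to sidestep the ordering problem is to scan the whole file for the hostname before logstash ever reads it. Below is a minimal Python sketch of that preprocessing step. It assumes the hostname always appears in a line shaped like the `Process id: 4104@server01` line in the sample above; the function name `find_hostname` and the regular expression are illustrative, not part of logstash.

```python
import re
from pathlib import Path
from typing import Optional

# Matches lines such as "        Process id: 4104@server01".
# Assumption: the host always follows "<pid>@" on a "Process id:" line.
PROCESS_ID_RE = re.compile(r"Process id:\s*\d+@(?P<host>\S+)")


def find_hostname(log_path: Path) -> Optional[str]:
    """Return the first hostname found in the log, or None if there is none."""
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = PROCESS_ID_RE.search(line)
            if match:
                return match.group("host")
    return None
```

Because the scan finishes before any event is emitted, the value is known for every line, including those written before the start-up banner.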

That could be really useful! I'd have to set up a script to parse the log first, set the environment variable, and then pass the file to logstash :face_with_monocle:

I'll have to see if that would work.... (it does in my mind)
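A rough Python sketch of that wrapper idea follows. It assumes the hostname has already been extracted from the log, that a `logstash` executable is on the PATH inside the container, and that the pipeline config reads a variable named `LOG_SOURCE_HOST`. That variable name and both function names are made up for illustration.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional

# Hypothetical variable name; the pipeline config would have to reference it.
HOST_ENV_VAR = "LOG_SOURCE_HOST"


def build_env(hostname: str, base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the base environment (default: os.environ) and add the hostname."""
    env = dict(os.environ if base is None else base)
    env[HOST_ENV_VAR] = hostname
    return env


def run_logstash(hostname: str, config_path: str, runner: Callable = subprocess.run):
    """Start logstash with the hostname exported to its environment.

    Assumes a `logstash` executable is on the PATH; `runner` is injectable
    so the function can be exercised without actually launching logstash.
    """
    command = ["logstash", "-f", config_path]
    return runner(command, env=build_env(hostname), check=True)
```

On the logstash side, the config can substitute environment variables with the `${VAR}` syntax, so a `mutate` filter with `add_field` could set a field to `${LOG_SOURCE_HOST}` on every event. The value is resolved when the pipeline is loaded, so this only fits one host's logs per logstash run.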
