Sending Log4J logs (in XML) again

Hi,

This is a follow-up to an earlier post on a similar topic.

I'm trying to send Log4j logs in XML format to Elasticsearch using Logstash.

My XML file is:

    <log4j:event logger="Common.Core.Sessions.SessionManager" level="INFO" timestamp="1567418641859" thread="8">
    	<log4j:message> Session fb9d3408-d370 created for user {9131559e-3b0b} at 127.0.0.1:4931</log4j:message>
    	<log4j:properties>
    		<log4j:data name="ConnectionId" value="0HLPFHJFNA3PK" />
    		<log4j:data name="RequestId" value="0HLPFHJFNA3PK:00000007" />
    	</log4j:properties>
    </log4j:event>

and my logstash.conf file is:

input {
  beats {
    port => 5044
    type => "log"
  }
}

filter {
  xml {
    source => "message"
    xpath => [
      "/log4j:event/log4j:message/text()", "messageMES"
    ]
    store_xml => true
    target => "doc"
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+yyyy.ww}"
    document_type => "%{[@metadata][type]}"
  }
}

My (partial) Filebeat config is:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\ProgramData\LogTest\*.xml

  # Multiline options
  multiline.pattern: '^<log4j:event'
  multiline.negate: true
  multiline.match: after

The issue is that all the services start correctly and I don't see any errors in any log files, but the message I've captured from the XML file ("Session fb9d3408-d370 created for user {9131559e-3b0b} at 127.0.0.1:4931") does not show up in the Kibana logs.

Is there anything wrong in the configs, or is there something else I'm missing?

Thanks in advance,
Jy

Hi @JY_DT,

could you elaborate on "does not show up"? Do you mean the events can't be found through the search API? What do you expect to show up, and where?

Hi @weltenwort,

I was expecting it to show up in Kibana when I navigate to the logs, just like I see the entries for the regular (non-XML) log files.

My understanding is that I should be able to see the message that I've captured from the XML file ("Session fb9d3408-d370 created for user {9131559e-3b0b} at 127.0.0.1:4931"). Isn't that correct?

Or do I need to search for that using something different?

In the logstash.conf file, I have this filter:

xpath => [
    "/log4j:event/log4j:message/text()", "messageMES"
]
store_xml => true
target => "doc"

So, I was expecting the "messageMES" to show up in the Kibana log.
Thanks.

My main intention is to show fields from each event in the XML (<log4j:event...</log4j:event>) as a single line in the display.
For example, some fields that I've selected should be displayed as:

"Session fb9d3408-d370 created for user {9131559e-3b0b} at 127.0.0.1:4931" "0HLPFHJFNA3PK" "0HLPFHJFNA3PK:00000007"

i.e. by selecting the message, ConnectionId and RequestId, and maybe some more fields.

Thanks.

So you want to use the "Logs" app? In that case there are a few things to keep in mind:

The Logs UI expects the events to adhere to the Elastic Common Schema (ECS). That means in particular that you should use the following destination fields when extracting information from your XML document:

  • @timestamp - date field
    ["/log4j:event/@timestamp", "@timestamp"]
  • message - text field
    ["/log4j:event/log4j:message/text()", "message"]
  • log.logger - keyword field
    ["/log4j:event/@logger", "[log][logger]"]
  • log.level - keyword field
    ["/log4j:event/@level", "[log][level]"]

It is also important that the mapping defines the correct field types.
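
For illustration, a filter wiring those XPath expressions to the ECS destination fields might look like the sketch below. Note this is only a sketch, untested against your pipeline: it injects the missing xmlns:log4j declaration first (its absence likely also explains why your original xpath filter produced nothing) and parses the epoch-milliseconds timestamp with a date filter instead of writing to @timestamp directly:

filter {
  mutate {
    # the events lack an xmlns:log4j declaration, so inject one to make them well-formed XML
    gsub => [
      "message", "<log4j:event ", '<log4j:event xmlns:log4j="http://jakarta.apache.org/log4j/" '
    ]
  }
  xml {
    source => "message"
    store_xml => false
    # bind the log4j: prefix so the XPath expressions below can match
    namespaces => { "log4j" => "http://jakarta.apache.org/log4j/" }
    xpath => [
      "/log4j:event/log4j:message/text()", "message",
      "/log4j:event/@logger", "[log][logger]",
      "/log4j:event/@level", "[log][level]",
      "/log4j:event/@timestamp", "[@metadata][timestamp_ms]"
    ]
  }
  mutate {
    # xpath stores each match as an array of strings; collapse the timestamp to a scalar
    join => { "[@metadata][timestamp_ms]" => "" }
  }
  date {
    match => [ "[@metadata][timestamp_ms]", "UNIX_MS" ]
  }
}

Note that xpath always writes arrays of matches; single-element arrays index fine for keyword and text fields, but the timestamp needs to be collapsed before the date filter can parse it.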

Then you'll have to configure the Logs UI to access your indices. That can be done via the settings tab.

I was able to get the ingestion of your example doc to work with the following Logstash filter config:

filter {
  mutate {
    # inject log4j xml namespace
    gsub => [
      "message", "<log4j:event ", '<log4j:event xmlns:log4j="http://jakarta.apache.org/log4j/" '
    ]
  }
  xml {
    source => "message"
    store_xml => true
    force_array => false
    target => "log4j_event"
  }
  mutate {
    # convert to ECS
    copy => {
      "message" => "[log][original]"
    }
    copy => {
      "[log4j_event][message]" => "message"
    }
    copy => {
      "[log4j_event][logger]" => "[log][logger]"
    }
    copy => {
      "[log4j_event][level]" => "[log][level]"
    }
    convert => {
      "[log4j_event][timestamp]" => "integer"
    }
  }
  date {
    match => [
      "[log4j_event][timestamp]", "UNIX_MS"
    ]
  }
}
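
With your sample event, this pipeline should produce a document roughly like the following (abbreviated; 1567418641859 ms corresponds to 2019-09-02T10:04:01.859Z):

{
  "@timestamp": "2019-09-02T10:04:01.859Z",
  "message": " Session fb9d3408-d370 created for user {9131559e-3b0b} at 127.0.0.1:4931",
  "log": {
    "level": "INFO",
    "logger": "Common.Core.Sessions.SessionManager",
    "original": "<log4j:event xmlns:log4j=\"http://jakarta.apache.org/log4j/\" ..."
  },
  "log4j_event": { ... }
}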

and an index template like

{
  "_routing": {
    "required": false
  },
  "numeric_detection": false,
  "dynamic_date_formats": [
    "strict_date_optional_time",
    "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
  ],
  "_meta": {},
  "_source": {
    "excludes": [],
    "includes": [],
    "enabled": true
  },
  "dynamic": true,
  "date_detection": true,
  "properties": {
    "@timestamp": {
      "type": "date"
    },
    "log": {
      "type": "object",
      "properties": {
        "level": {
          "type": "keyword"
        },
        "logger": {
          "type": "keyword"
        }
      }
    },
    "log4j_event": {
      "type": "object",
      "properties": {
        "properties": {
          "type": "object",
          "properties": {
            "data": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "keyword"
                },
                "value": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    },
    "message": {
      "type": "text"
    }
  }
}

Maybe that helps you a bit. You might have to adapt the index name patterns to your setup, of course.

Hi @weltenwort,

Thanks for the detailed answer. Yes, I want to use the Logs app.

However, I'm not very familiar with index templates and am looking through the docs to learn more.

If I understand correctly:

  1. I need to create an index template with what you've shown going into the "mappings".
    And I can specify a custom pattern like "MyPattern" in the "index_patterns" section.

  2. Create an index matching the template. How do I do this? (say with a name "MyPatternName")

  3. In logstash.conf, I can use the name of this index, for example:

    output {
      elasticsearch {
        hosts => "localhost:9200"
        sniffing => true
        manage_template => false
        index => "MyPatternName"
        document_type => "%{[@metadata][type]}"
      }
    }

I think I've understood the rest.

Thanks again.

When a client (like Logstash) wants to write documents to an index which doesn't exist, Elasticsearch will by default create that index on demand. Index templates are a mechanism by which you can tell Elasticsearch to use a specific mapping and specific settings for new indices whose name match a certain pattern (the index_patterns key of the index template). These patterns can contain wildcards (like *) in order to match a family of indices whose name has a dynamic part.

This dynamic part is often used to partition the indices by encoding a date in the index name, as you did in your initial Logstash config snippet: index => "%{[@metadata][beat]}-%{+yyyy.ww}". In your case that means that Logstash will write the documents to indices named filebeat-${year}.${week}, so the index template could try to match the pattern filebeat-*.

(BTW, the document_type setting in the Logstash output config is deprecated and should be avoided if you're running Elasticsearch version 6.0 or newer.)

My recommendation would be to pick a naming scheme for your indices that represents the problem domain you are working within. That could be something like server-logs-%{+YYYY.MM.dd} in the Logstash output config and server-logs-* in the index template index_patterns setting. You can create the index templates via the UI from Kibana version 7.4 onwards, or via the Elasticsearch PUT _template HTTP API, as sketched below.
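
Here is a minimal sketch of that API call, using the hypothetical server-logs naming scheme from above; the full "mappings" body from my earlier post would go where the two essential fields are shown:

PUT _template/server-logs
{
  "index_patterns": ["server-logs-*"],
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "message": { "type": "text" }
    }
  }
}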

For the Logs UI to work correctly, the presence of the @timestamp date field and the message text field is most important. That's why I copied the values to these fields in the example Logstash pipeline.

OK, so I've created an Index Template, but the index is not getting created automatically.

Name of Index template: filebeatlog4j
Index patterns: filebeatlog4j-*

I've performed these steps:

  1. Created an Index Template in Kibana named "filebeatlog4j" with index patterns "filebeatlog4j-*". In the template creation UI, I've placed your config in the "mappings" section; the "settings" section does not contain anything.

  2. Configured the Logs UI to access the new indices as you've shown.

  3. Added the filter config you've shared to the filter section of logstash.conf.

  4. Restarted all services, including the Filebeat service running on a client machine from where the XML file will be parsed and sent to Logstash.

But I'm not able to see the output. What am I doing wrong?

Thanks.

Complete logstash.conf file:

input {
  beats {
    port => 5044
    type => "log"
  }
}

filter {
  mutate {
    # inject log4j xml namespace
    gsub => [
      "message", "<log4j:event ", '<log4j:event xmlns:log4j="http://jakarta.apache.org/log4j/" '
    ]
  }
  xml {
    source => "message"
    store_xml => true
    force_array => false
    target => "log4j_event"
  }
  mutate {
    # convert to ECS
    copy => {
      "message" => "[log][original]"
    }
    copy => {
      "[log4j_event][message]" => "message"
    }
    copy => {
      "[log4j_event][logger]" => "[log][logger]"
    }
    copy => {
      "[log4j_event][level]" => "[log][level]"
    }
    convert => {
      "[log4j_event][timestamp]" => "integer"
    }
  }
  date {
    match => [
      "[log4j_event][timestamp]", "UNIX_MS"
    ]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}log4j-%{+yyyy.ww}"
  }
}

Thanks for sharing these details. I don't see anything immediately wrong. Some avenues of investigation could be:

  • Can Logstash connect to Elasticsearch successfully?
  • Does the Elasticsearch log contain anything about indexing errors?
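
A quick way to check whether any documents arrive at all is to list the matching indices, e.g. in the Kibana Dev Tools console (adjust the pattern to your index names):

GET _cat/indices/*log4j*?v

An empty response would suggest the documents never reach Elasticsearch, which points at the Beats-to-Logstash or Logstash-to-Elasticsearch connection.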

Oh sorry, I had made a trivial mistake. The Logstash/Elasticsearch server uses a dynamic IP address and it had changed. I've updated filebeat.yml with the new IP address and now everything works fine.

Some things I need to clarify though:

In the Logs UI I see the timestamp, the log level, and the message.
But we have also configured to send the logger information from the XML log file,

logger="Common.Core.Sessions.SessionManager"

How can I see this in the Logs UI?

Also, if I modify the configuration to show, say, the "ConnectionId" and "RequestId" from my XML file, how would I be able to see them in the Logs UI?

Thanks again.

You should be able to modify the columns displayed in the Logs UI at the bottom of the settings tab.

To view the logger you could try adding the log.logger field.

Seeing how the ConnectionId and RequestId are both encoded in the XML as <log4j:data> elements, we would probably have to transform them into proper key-value pairs in Logstash. I'll have to experiment a bit with that myself. :grimacing:

My Ruby is not the best, but a filter like

filter {
  ruby {
    # turn the array of <log4j:data name="..." value="..."/> elements into
    # a hash like { "ConnectionId" => "...", "RequestId" => "..." }
    code => "event.set('[data]', Hash[event.get('[log4j_event][properties][data]').map { |item| [item['name'], item['value']] }])"
  }
}

should add a data property to the event that contains the data key-value pairs as an object. That means that, given your example XML, a mapping for this should look something like

    "data": {
      "type": "object",
      "properties": {
        "ConnectionId": {
          "type": "keyword"
        },
        "RequestId": {
          "type": "keyword"
        }
      }
    },

which could then be used to filter the log stream in the UI or to add data.ConnectionId and data.RequestId as columns.
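
Given your sample event, the resulting fragment in the indexed document should look something like:

    "data": {
      "ConnectionId": "0HLPFHJFNA3PK",
      "RequestId": "0HLPFHJFNA3PK:00000007"
    }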

Sure, thanks.
