Split of JSON array into multiple events in Kibana

Hi,

I am trying to split this log schema into separate events (logs):

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQS11DmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 33297111461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.Completed",
                "data_type": "System.DateTime",
                "value": {
                  "original": null,
                  "new": "2020-08-13T16:59:50.8491889Z"
                }
              },
              {
                "property": "Target.State",
                "data_type": "System.String",
                "value": {
                  "original": "good",
                  "new": "very good."
                }
              },
              {
                "property": "Target.State",
                "data_type": "System.String",
                "value": {
                  "original": "Processing_Completed",
                  "new": "Completed"
                }
              }

into something like this:

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQSDmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 33291117461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.Completed",
                "data_type": "System.DateTime",
                "value": {
                  "original": null,
                  "new": "2020-08-13T16:59:50.8491889Z"
                }
              }

and this:

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQSDmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 3329111461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.State",
                "data_type": "System.String",
                "value": {
                  "original": "good",
                  "new": "very good."
                }
              }

I am using this configuration, but it doesn't work:

input { 
  elasticsearch {
    hosts => ["https://xxxxxxxxxxx"]
    index => "xxxxxxxxx"
    user => "lxxxxxxxxxxxx"
    password => "xxxxxxxxxxx"
    query => '{ "qxxxxxxxxxxx" }}}'
  }
}

filter {
  json {
    source => "message"
  }
  split {
    field => "[data][property_changes]"
  }
  mutate {
    add_field => {
      "[user][name]" => "%{[username]}"
      "[event][id]" => "%{[messageId]}"
      "[event][type]" => "%{[eventType]}"
      "[event][action]" => "%{[data][property_changes][property]}"
    }
    remove_field => ["message"]
  }
}

output {
  elasticsearch {
    hosts => "https://xxxxxxxxxxxxxxxxxxx.xxxxxxxxxx"
    index => "cxxxxxxxxxxx"
    user => "xxxxxxxxxxx"
    password => "xxxxxxxxxx"
  }
}
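For context, this is my understanding of what the split filter should do, sketched in plain Ruby. The field names come from the log above; this is only an illustration of the expected behaviour, not the plugin's actual implementation.

```ruby
# Sketch: clone the event once per element of data.property_changes,
# replacing the array with a single element in each clone (assumption
# about split's behaviour, not the plugin's real code).
def split_events(event, field_path = ["data", "property_changes"])
  parent = field_path[0...-1].reduce(event) { |h, k| h[k] }
  key = field_path[-1]
  (parent[key] || []).map do |element|
    clone = Marshal.load(Marshal.dump(event)) # deep copy of the event
    target = field_path[0...-1].reduce(clone) { |h, k| h[k] }
    target[key] = element # the array becomes this one element
    clone
  end
end
```

With the example log above, this should yield three events, one per entry of `property_changes`, each keeping the shared fields (`messageId`, `username`, and so on).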

I have been checking multiple threads; some of them cover a similar case to mine, but it doesn't work when I try to adapt their solutions to my situation. I tried something similar to this: Split nested json array

Maybe you can help, @magnusbaeck, since I have seen you in many threads. Could it work with a ruby plugin?

Please refrain from pinging folks directly; this is a forum, and anyone who participates might be able to assist you.

BTW I moved your question to #elastic-stack:logstash

What do you mean by that?

Hi @Badger,

I meant that when I try to use what is in that thread, adjusted to my code, it doesn't split "property_changes" as expected.

Given my initial log, I would expect this:

1 log

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQS11DmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 33297111461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.Completed",
                "data_type": "System.DateTime",
                "value": {
                  "original": null,
                  "new": "2020-08-13T16:59:50.8491889Z"
                }
              }

2 log

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQS11DmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 33297111461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.State",
                "data_type": "System.String",
                "value": {
                  "original": "good",
                  "new": "very good."
                }
              }

3 log

        "_index": "cocacola",
        "_type": "raw",
        "_id": "CqwJ63MBEQS11DmXsDyRZl",
        "_score": 1,
        "_source": {
          "messageId": 33297111461,
          "eventType": "EntityUpdated",
          "username": "Administrator",
          "timeStamp": "2020-08-13T16:59:50.87Z",
          "data": {
            "name": "Target:1111111",
            "definition": "Target",
            "is_new": null,
            "user_id": null,
            "usergroup_id": null,
            "rules": null,
            "property_changes": [
              {
                "property": "Target.State",
                "data_type": "System.String",
                "value": {
                  "original": "Processing_Completed",
                  "new": "Completed"
                }
              }

but I end up with only logs 1 and 3. The second log is not shown in Kibana.

If some events are being indexed but others are not, I would suspect a mapping exception. For example, if dynamic mapping decided that a particular field should be a date, then any event where it was not a date would be rejected, and logstash would log an exception. I do not think that would happen for "[property_changes][value][new]" being "2020-08-13T16:59:50.8491889Z", because that is not a date format that elasticsearch would auto-detect, but it is worth looking at the logstash logs.

You are right, @Badger, that should be the problem. Before making any change to the mapping: how can I set up a given field to contain multiple data types? For example, "value.new" can sometimes contain a string and sometimes a date.

Thanks for your feedback. Really helpful.

A field in elasticsearch can only have one type, so value.new would have to be a string. If dynamic mapping gets that wrong, you would have to use a template to set the type.
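For what it's worth, a minimal sketch of such a template. The index pattern, template name, and field path are assumptions based on the log above, and this uses the legacy `_template` API (newer Elasticsearch versions use `_index_template` instead):

```json
PUT _template/property_changes_template
{
  "index_patterns": ["cocacola*"],
  "mappings": {
    "properties": {
      "data": {
        "properties": {
          "property_changes": {
            "properties": {
              "value": {
                "properties": {
                  "new": { "type": "keyword" },
                  "original": { "type": "keyword" }
                }
              }
            }
          }
        }
      }
    }
  }
}
```

Mapping both `value.new` and `value.original` as `keyword` means dates and numbers are simply stored as their string representation, so no event is rejected for having the "wrong" type.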

The limitation I have is that this field "value.new" sometimes contains a date, a string, or a number, so from what I have been reading there is not much I can do to adjust it to each data type.
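One workaround might be to coerce the field to a string in Logstash itself, before indexing, using the mutate filter's convert option. A sketch, placed after the split filter (field paths are from the log above; untested):

```
filter {
  mutate {
    # Force value.new/value.original to strings so dynamic mapping
    # never types them as dates or numbers (sketch, untested)
    convert => {
      "[data][property_changes][value][new]" => "string"
      "[data][property_changes][value][original]" => "string"
    }
  }
}
```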

Thank you, @Badger.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.