Parse/process json array with filebeat

andrejcoliveira · January 19, 2024, 3:45pm

Hi everyone, at my company we're trying to load our Cucumber logs into Elastic. I'd like to check whether this is possible using only Filebeat. The main problem is that our logs are single-line arrays of JSON objects, like:

[{json_object}, {json_object}]

At the moment, my configuration creates only one event, with all of the JSON in the "message" field.

Thanks in advance for your help

strawgate · January 22, 2024, 6:19pm

Are you hoping that each json object becomes its own document?

If so, I don't believe that Filebeat can open a JSON array and split the objects into multiple documents. You'd likely want to preprocess the log or use something like Logstash to split it into multiple documents.
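As a rough illustration of the preprocessing option, a small script could rewrite the single-line array as one JSON object per line before Filebeat reads the file. This is only a sketch: the function name array_to_ndjson and the sample data are made up for the example, and it assumes the whole array fits in memory.

```python
import json


def array_to_ndjson(line: str) -> str:
    """Turn a single-line JSON array into newline-delimited JSON (one object per line)."""
    items = json.loads(line)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array at the top level")
    # json.dumps escapes newlines inside strings, so each object stays on one line.
    return "".join(json.dumps(item, separators=(",", ":")) + "\n" for item in items)


# Hypothetical two-object sample, not a real Cucumber report.
sample = '[{"id": "a", "line": 2}, {"id": "b", "line": 7}]'
ndjson = array_to_ndjson(sample)
print(ndjson, end="")
```

Each output line is then a complete JSON object, which suits a line-oriented reader.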

andrejcoliveira · January 23, 2024, 10:55am

I would like each JSON object from the array to be an event: in the same document, but with different identifiers (timestamp, id, etc.). I tested manually putting one object per line in the log file and it worked as expected: for one log, it created distinct events in the same document. I want to check whether Filebeat can identify the distinct objects if the log has just one line (plus a newline).

strawgate · January 23, 2024, 2:06pm

Yeah, sorry, I was using "document" to mean an event in Elasticsearch.

You may be able to play with the delimiter settings on the log input, but it's not JSON-aware, so you run the risk of a couple of issues.

But I'm not aware of a way you could have a JSON array on one line and have Filebeat split each object into multiple events for Elasticsearch.
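To illustrate why a delimiter that is not JSON-aware is risky, here is a small Python sketch (not Filebeat configuration) comparing a naive split on "}, {" with actual JSON parsing. The sample string and the is_valid_json helper are invented for the example.

```python
import json

# Hypothetical sample: two top-level objects, the first containing a nested array of objects.
raw = '[{"id": 1, "steps": [{"n": 1}, {"n": 2}]}, {"id": 2, "steps": []}]'


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON on its own."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Naive approach: drop the outer brackets and treat "}, {" as the event delimiter.
naive_chunks = raw[1:-1].split("}, {")
valid_chunks = [c for c in naive_chunks if is_valid_json(c)]

# JSON-aware approach: parse first, then iterate over the top-level objects.
parsed = json.loads(raw)

print(len(naive_chunks), len(valid_chunks), len(parsed))
```

The delimiter also matches inside the nested "steps" array, so the naive split yields more pieces than there are top-level objects, and none of them is a complete JSON object.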

andrejcoliveira · January 23, 2024, 3:27pm

Do you think Logstash would be better for this type of processing?

strawgate · January 23, 2024, 3:59pm

Yes, this is relatively easy to do in Logstash.

Depending on the format of the file, you may be able to have Filebeat read the file and use a Logstash pipeline to do the splitting.

andrejcoliveira · January 23, 2024, 4:17pm

Can you give me an example of how to do it? I don't know if you're familiar with the Cucumber log format, but basically each object in the array represents a feature, and each feature has several scenarios in the "elements" field. I want to know if it is possible to create an individual event for each scenario. It isn't straightforward, since it's an array inside another array.

strawgate · January 23, 2024, 4:22pm

I am not familiar with Cucumber logs, but here is the documentation for the split filter in Logstash, which you'd likely use for this: Split filter plugin | Logstash Reference [8.12] | Elastic

{ field1: ...,
 results: [
   { result ... },
   { result ... },
   { result ... },
   ...
] }

The split filter can be used on the above data to create a separate event for each value of the results field:

filter {
 split {
   field => "results"
 }
}

Logstash would then split the one event into three independent events to write to Elasticsearch:

event 1: { result ... }
event 2: { result ... }
event 3: { result ... }
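For a Cucumber report, where each feature carries an "elements" array of scenarios, the same idea could be applied to each feature. The following is a rough Python sketch of that behavior, not Logstash code: split_field and scenarios are made-up helper names, the sample data is invented, and it assumes each resulting event should keep the parent feature's other fields, which is how I understand the split filter to clone events.

```python
import copy
import json


def split_field(event: dict, field: str) -> list:
    """Return one copy of event per item in event[field], with the field
    replaced by that single item. Events without a list there pass through."""
    values = event.get(field)
    if not isinstance(values, list):
        return [event]
    clones = []
    for value in values:
        # Copy every other field, then attach just this one item.
        clone = {k: copy.deepcopy(v) for k, v in event.items() if k != field}
        clone[field] = value
        clones.append(clone)
    return clones


def scenarios(features: list) -> list:
    """Flatten a list of features into one event per entry in "elements"."""
    events = []
    for feature in features:
        events.extend(split_field(feature, "elements"))
    return events


# Invented sample: two features with two and one scenarios respectively.
sample = json.loads(
    '[{"uri": "a.feature", "elements": [{"line": 4}, {"line": 8}]},'
    ' {"uri": "b.feature", "elements": [{"line": 3}]}]'
)
events = scenarios(sample)
for event in events:
    print(json.dumps(event))
```

Here two features with three scenarios in total become three events, each carrying its feature's "uri" alongside a single scenario under "elements".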

Would you be able to share a sanitized Cucumber log in this thread?

andrejcoliveira · January 23, 2024, 4:32pm

Sure:

[
  {
    "line": 2,
    "elements": [
      {
        "line": 4,
        "name": "",
        "description": "",
        "type": "",
        "keyword": "",
        "steps": [
          {
            "result": {
              "duration": "",
              "status": ""
            },
            "line": 5,
            "name": "",
            "match": {
              "arguments": [
                {
                  "val": "",
                  "offset": 2
                }
              ],
              "location": ""
            },
            "keyword": " "
          }
        ]
      },
      {
        "start_timestamp": "",
        "before": [
          {
            "result": {
              "duration": 561992900,
              "status": ""
            },
            "match": {
              "location": ""
            }
          }
        ],
        "line": 8,
        "name": "",
        "description": "",
        "id": "",
        "after": [
          {
            "result": {
              "duration": 1327737100,
              "status": ""
            },
            "match": {
              "location": ""
            }
          }
        ],
        "type": "",
        "keyword": "",
        "steps": [
          {
            "result": {
              "duration": 2546265500,
              "status": ""
            },
            "line": 9,
            "name": "",
            "match": {
              "location": ""
            },
            "keyword": " "
          },
          {
            "embeddings": [
              {
                "data": "",
                "mime_type": "",
                "name": ""
              }
            ],
            "result": {
              "duration": 8510044800,
              "status": ""
            },
            "line": 10,
            "name": "",
            "match": {
              "arguments": [
                {
                  "val": "",
                  "offset": 25
                },
                {
                  "val": "",
                  "offset": 44
                },
                {
                  "val": "",
                  "offset": 52
                }
              ],
              "location": ""
            },
            "keyword": " "
          },
          {
            "result": {
              "duration": 24000600,
              "status": ""
            },
            "line": 11,
            "name": "",
            "match": {
              "location": ""
            },
            "keyword": " "
          }
        ],
        "tags": [
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          }
        ]
      },
      {
        "line": 4,
        "name": "",
        "description": "",
        "type": "",
        "keyword": "",
        "steps": [
          {
            "result": {
              "duration": 853000700,
              "status": ""
            },
            "line": 5,
            "name": "",
            "match": {
              "arguments": [
                {
                  "val": "",
                  "offset": 2
                }
              ],
              "location": ""
            },
            "keyword": ""
          }
        ]
      },
      {
        "start_timestamp": "",
        "before": [
          {
            "result": {
              "duration": 1000500,
              "status": ""
            },
            "match": {
              "location": ""
            }
          }
        ],
        "line": 15,
        "name": "",
        "description": "",
        "id": "",
        "after": [
          {
            "result": {
              "duration": 128004000,
              "status": ""
            },
            "match": {
              "location": ""
            }
          }
        ],
        "type": "",
        "keyword": "",
        "steps": [
          {
            "result": {
              "duration": 1852995400,
              "status": ""
            },
            "line": 16,
            "name": "",
            "match": {
              "location": ""
            },
            "keyword": " "
          },
          {
            "embeddings": [
              {
                "data": "",
                "mime_type": "",
                "name": ""
              }
            ],
            "result": {
              "duration": 3733347800,
              "status": ""
            },
            "line": 17,
            "name": "",
            "match": {
              "arguments": [
                {
                  "val": "",
                  "offset": 25
                },
                {
                  "val": "",
                  "offset": 44
                },
                {
                  "val": "",
                  "offset": 52
                }
              ],
              "location": ""
            },
            "keyword": " "
          },
          {
            "result": {
              "duration": 9997000,
              "status": ""
            },
            "line": 18,
            "name": "",
            "match": {
              "location": ""
            },
            "keyword": " "
          }
        ],
        "tags": [
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          },
          {
            "name": ""
          }
        ]
      }
    ],
    "name": "",
    "description": "",
    "id": "",
    "keyword": "",
    "uri": "",
    "tags": [
      {
        "name": "",
        "type": "",
        "location": {
          "line": 1,
          "column": 1
        }
      },
      {
        "name": "",
        "type": "",
        "location": {
          "line": 1,
          "column": 8
        }
      }
    ]
  }
]

Sorry if it is too long, but my JSON log has more than 17000 lines; this is just a sample of one feature. What I want is one event for each element of each JSON object in the array.

strawgate · January 23, 2024, 4:42pm

And is the actual file all on one line without any line breaks? Or is it pretty-printed like this? Or something in between?

Could you maybe share a screenshot of the file with word wrap off, in its original format, to give me a better idea?

andrejcoliveira · January 23, 2024, 4:48pm

[ { "line": 2, "elements": [ { "line": 4, "name": "", "description": "", "type": "", "keyword": "", "steps": [ { "result": { "duration": "", "status": "" }, "line": 5, "name": "", "match": { "arguments": [ { "val": "", "offset": 2 } ], "location": "" }, "keyword": " " } ] }, { "start_timestamp": "", "before": [ { "result": { "duration": 561992900, "status": "" }, "match": { "location": "" } } ], "line": 8, "name": "", "description": "", "id": "", "after": [ { "result": { "duration": 1327737100, "status": "" }, "match": { "location": "" } } ], "type": "", "keyword": "", "steps": [ { "result": { "duration": 2546265500, "status": "" }, "line": 9, "name": "", "match": { "location": "" }, "keyword": " " }, { "embeddings": [ { "data": "", "mime_type": "", "name": "" } ], "result": { "duration": 8510044800, "status": "" }, "line": 10, "name": "", "match": { "arguments": [ { "val": "", "offset": 25 }, { "val": "", "offset": 44 }, { "val": "", "offset": 52 } ], "location": "" }, "keyword": " " }, { "result": { "duration": 24000600, "status": "" }, "line": 11, "name": "", "match": { "location": "" }, "keyword": " " } ], "tags": [ { "name": "" }, { "name": "" }, { "name": "" }, { "name": "" }, { "name": "" } ] }, { "line": 4, "name": "", "description": "", "type": "", "keyword": "", "steps": [ { "result": { "duration": 853000700, "status": "" }, "line": 5, "name": "", "match": { "arguments": [ { "val": "", "offset": 2 } ], "location": "" }, "keyword": "" } ] }, { "start_timestamp": "", "before": [ { "result": { "duration": 1000500, "status": "" }, "match": { "location": "" } } ], "line": 15, "name": "", "description": "", "id": "", "after": [ { "result": { "duration": 128004000, "status": "" }, "match": { "location": "" } } ], "type": "", "keyword": "", "steps": [ { "result": { "duration": 1852995400, "status": "" }, "line": 16, "name": "", "match": { "location": "" }, "keyword": " " }, { "embeddings": [ { "data": "", "mime_type": "", "name": "" } ], "result": { "duration": 
3733347800, "status": "" }, "line": 17, "name": "", "match": { "arguments": [ { "val": "", "offset": 25 }, { "val": "", "offset": 44 }, { "val": "", "offset": 52 } ], "location": "" }, "keyword": " " }, { "result": { "duration": 9997000, "status": "" }, "line": 18, "name": "", "match": { "location": "" }, "keyword": " " } ], "tags": [ { "name": "" }, { "name": "" }, { "name": "" }, { "name": "" }, { "name": "" } ] } ], "name": "", "description": "", "id": "", "keyword": "", "uri": "", "tags": [ { "name": "", "type": "", "location": { "line": 1, "column": 1 } }, { "name": "", "type": "", "location": { "line": 1, "column": 8 } } ] } ]

This is a sample of the log in its original form. As I said before, it is one line of data followed by one blank line.

system · February 20, 2024, 6:48pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.