Help requested to iterate and join sub-arrays

Hi everyone,

Please forgive me if this is a noob question, or if it has already been answered, but I have not been able to find an answer myself.

Let's say I have this data source, with an array of vars, where each content may be arbitrary and of type array or string:

    "nodelog": {
        "body": {
            "email": "xxxxxxxxxxxx@gmail.com",
            "template": "Alertxxxxxxxxxx",
            "vars": [{
                    "name": "users",
                    "content": [{
                            "item1": "value1",
                            "item2": "value2", 
                            "user_id": "daba4a95-c585-4001-9195-351fd859914b",
                            "value_id": "154770",
                            "last_change": "2024-01-25T14:30:49.382Z"
                        }
                    ],
                    [{
                            "item1": "value10",
                            "item2": "value20", 
                            "user_id": "54678945-c585-4001-9195-351fd859914b",
                            "value_id": "654789",
                            "last_change": "2024-01-22T12:30:40.123Z"
                        }
                    ]
                }, {
                    "name": "CRON_JOB",
                    "content": "control-users"
                }, {
                    "name": "GENERIC_TEXT",
                    "content": "lorem ipsum..."
                }
            ]
        },
    },

What I am looking for is to iterate through any found item of vars and systematically convert each content to a string representation.

For the example above, I am trying to build something like this using a Logstash pipeline:

    "nodelog": {
        "body": {
            "email": "xxxxxxxxxxxx@gmail.com",
            "template": "Alertxxxxxxxxxx",
            "vars": [{
                    "name": "users",
                    "content": "{ \"item1\": \"value1\", \"item2\": \"value2\",  \"user_id\": \"daba4a95-c585-4001-9195-351fd859914b\", \"value_id\": \"154770\", \"last_change\": \"2024-01-25T14:30:49.382Z\" },{ \"item1\": \"value10\", \"item2\": \"value20\",  \"user_id\": \"value_id\": \"654789\", \"last_change\": \"2024-01-22T12:30:40.123Z\" }"
                }, {
                    "name": "CRON_JOB",
                    "content": "control-users"
                }, {
                    "name": "GENERIC_TEXT",
                    "content": "lorem ipsum..."
                }
            ]
        },
    },

I guess that I should use a mutate filter, maybe with the join operation, but I don't know if I have to use a block of ruby code (which I have never used before) to iterate through the vars array, or if there is a simpler way to address all the content values at once.

Would it be possible to write something like this, even if vars is an array?

    mutate {
      join => { "[nodelog][body][vars][content]"  => "," }
    }

Any help or guideline would be greatly appreciated.

Thanks in advance
Louis

Your JSON is not valid. "content" cannot be a list of arrays. It could be an array of arrays. It could be an array of hashes. You need to show us valid JSON that you want to reconfigure.

If it is an array of hashes then you could try

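    ruby {
        code => '
            vars = event.get("[nodelog][body][vars]")
            # Only proceed if vars is array-like (responds to each_index)
            if vars.respond_to? "each_index"
                vars.each_index { |x|
                    # String contents (CRON_JOB, GENERIC_TEXT) are left untouched
                    if vars[x]["content"].respond_to? "each_index"
                        # Concatenate the string form of each element, comma-separated
                        newContent = ""
                        vars[x]["content"].each_index { |y|
                            newContent += vars[x]["content"][y].to_s + ","
                        }
                        newContent.delete_suffix!(",")

                        event.set("[nodelog][body][vars][#{x}][content]", newContent)
                    end
                }
            end
        '
    }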
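
For anyone who wants to try the same idea outside Logstash first, here is a minimal standalone sketch in plain Ruby. It is not the filter above: `stringify_contents` is a hypothetical helper name, the sample data is trimmed down, and it uses `to_json` instead of `to_s` on the assumption that JSON-style text is wanted (on a Ruby hash, `to_s` gives the inspect-style notation with `=>`).

```ruby
require "json"

# Hypothetical helper, not part of Logstash: returns a copy of vars in which
# every array-valued "content" is replaced by a comma-joined string.
def stringify_contents(vars)
  return vars unless vars.is_a?(Array)

  vars.map do |entry|
    content = entry["content"]
    if content.is_a?(Array)
      # to_json renders each element as JSON text
      entry.merge("content" => content.map(&:to_json).join(","))
    else
      entry # strings and other non-array contents are left untouched
    end
  end
end

# Trimmed-down sample data, assumed for illustration only
vars = [
  { "name" => "users",
    "content" => [{ "item1" => "value1" }, { "item1" => "value10" }] },
  { "name" => "CRON_JOB", "content" => "control-users" }
]

result = stringify_contents(vars)
puts result[0]["content"]  # {"item1":"value1"},{"item1":"value10"}
puts result[1]["content"]  # control-users
```

Inside a ruby filter the same loop would read the array with `event.get` and write each new string back with `event.set`, as in the filter above.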

Hi Badger,

Ah, yes, you're right.
I am sorry, it is a copy/paste mistake.

Here is the right initial content:

    "nodelog": {
        "body": {
            "email": "xxxxxxxxxxxx@gmail.com",
            "template": "Alertxxxxxxxxxx",
            "vars": [{
                    "name": "users",
                    "content": [{
                            "item1": "value1",
                            "item2": "value2", 
                            "user_id": "daba4a95-c585-4001-9195-351fd859914b",
                            "value_id": "154770",
                            "last_change": "2024-01-25T14:30:49.382Z"
                        },
                        {
                            "item1": "value10",
                            "item2": "value20", 
                            "user_id": "54678945-c585-4001-9195-351fd859914b",
                            "value_id": "654789",
                            "last_change": "2024-01-22T12:30:40.123Z"
                        }
                    ]
                }, {
                    "name": "CRON_JOB",
                    "content": "control-users"
                }, {
                    "name": "GENERIC_TEXT",
                    "content": "lorem ipsum..."
                }
            ]
        },
    },

Thanks for pointing this out.
Louis-Marie

Any help please?

Thanks in advance

What you posted is still not valid standalone JSON. If I use

    { "nodelog": {
        "body": {
            "email": "xxxxxxxxxxxx@gmail.com",
            "template": "Alertxxxxxxxxxx",
            "vars": [{
                    "name": "users",
                    "content": [{
                            "item1": "value1",
                            "item2": "value2",
                            "user_id": "daba4a95-c585-4001-9195-351fd859914b",
                            "value_id": "154770",
                            "last_change": "2024-01-25T14:30:49.382Z"
                        },
                        {
                            "item1": "value10",
                            "item2": "value20",
                            "user_id": "54678945-c585-4001-9195-351fd859914b",
                            "value_id": "654789",
                            "last_change": "2024-01-22T12:30:40.123Z"
                        }
                    ]
                }, {
                    "name": "CRON_JOB",
                    "content": "control-users"
                }, {
                    "name": "GENERIC_TEXT",
                    "content": "lorem ipsum..."
                }
            ]
        }
    }}

and run the ruby filter I posted then it produces what you asked for.
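
A quick way to check whether a sample is valid standalone JSON before posting it is to run it through a parser. The sketch below uses Ruby's standard `json` library. `valid_json?` is a hypothetical helper name, and the two strings are shortened stand-ins for the fragment and the wrapped document, not the full samples from this thread.

```ruby
require "json"

# Hypothetical helper: true if text parses as a standalone JSON document.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# Shortened stand-ins, assumed for illustration only
fragment = '"nodelog": { "body": { "vars": [] } },'
wrapped  = '{ "nodelog": { "body": { "vars": [] } } }'

puts valid_json?(fragment)  # false
puts valid_json?(wrapped)   # true
```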