Loop through and translate nested object

Hi,

I'm pretty sure I need to use Ruby to achieve this, but I don't really know anything about it.

I've already done a translation with Logstash where I enrich an ID field that occurs once per document. What I need to do now is add a name and supplier to each entry in a nested array that could contain one or many entries.

For example:

"customerID" : 123456,
"customerName": "John smith",
"Orders" : [
            {
              "orderId" : "123456",
              "dateTimeOrdered" : "2019-02-01T00:07:39Z",
              "orderedItems" : [
                {
                  "objectType" : 0,
                  "productId" : 1536                  
                },
                {
                  "objectType" : 0,
                  "productId" : 1529
                },
                {
                  "objectType" : 0,
                  "productId" : 1490
                },
                {
                  "objectType" : 0,
                  "productId" : 1535
                }
              ]
            }
          ]

And using a translate filter with dictionary records that look like

'1536' : '{"name": "red ball", "supplier" : "Fred"}'
'1529' : '{"name" : "yellow rocket","supplier": "Steve"}'
'1490' : '{"name" : "fire truck","supplier" : "Jim Bob"}'
'1535' : '{"name" : "squad car", "supplier" : "Jim Bob"}'

I'd like Logstash to use an elasticsearch plugin to collect that data, a translate filter using the above dictionary file to loop through each of the orders, and then the elasticsearch output plugin to update the document to look like this:

"customerID" : 123456,
"customerName": "John smith",
"Orders" : [
            {
              "orderId" : "123456",
              "dateTimeOrdered" : "2019-02-01T00:07:39Z",
              "orderedItems" : [
                {
                  "objectType" : 0,
                  "productId" : 1536,
                  "name" : "red ball",
                  "supplier" : "Fred"
                },
                {
                  "objectType" : 0,
                  "productId" : 1529,
                  "name" : "yellow rocket",
                  "supplier": "Steve"
                },
                {
                  "objectType" : 0,
                  "productId" : 1490,
                  "name" : "fire truck",
                  "supplier" : "Jim Bob"
                },
                {
                  "objectType" : 0,
                  "productId" : 1535,
                  "name" : "squad car",
                  "supplier" : "Jim Bob"
                }
              ]
            }
          ]

There would be other fields in the document, but I've omitted them for the sake of brevity.

I feel Ruby comes into this somewhere, but I don't know enough about it to know where to start.

Many thanks for your help in advance
Kind regards
Ant

Take a look at the iterate_on option to translate.

    translate {
        iterate_on => "[Orders][0][orderedItems]"
        field => "productId"
        destination => "foo"
        dictionary_path => "/home/user/foo.yml"
    }

will get you

           "orderedItems" => [
            [0] {
                "objectType" => 0,
                 "productId" => 1536,
                       "foo" => "{\"name\": \"red ball\", \"supplier\" : \"Fred\"}"
            },
            [1] {
                "objectType" => 0,
                 "productId" => 1529,
                       "foo" => "{\"name\" : \"yellow rocket\",\"supplier\": \"Steve\"}"
            },

Note that if your yml file entries did not have single quotes on the RHS and just looked like

'1535' : {"name" : "squad car", "supplier" : "Jim Bob"}

Then you would instead get

               "orderedItems" => [
            [0] {
                "objectType" => 0,
                 "productId" => 1536,
                       "foo" => {
                        "name" => "red ball",
                    "supplier" => "Fred"
                }
            },
            [1] {
                "objectType" => 0,
                 "productId" => 1529,
                       "foo" => {
                        "name" => "yellow rocket",
                    "supplier" => "Steve"
                }
            },

In either case a fairly simple ruby filter can be used to adjust the field names.
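
As a rough illustration of that adjustment, here is a minimal sketch in plain Ruby that works on an ordinary Hash so it can be run outside Logstash. It assumes the destination field is "foo" as in the example above; `flatten_items` is a hypothetical helper name, not part of any plugin. It handles both dictionary styles: a JSON string (quoted RHS) and an already-parsed hash (unquoted RHS).

```ruby
require 'json'

# Hypothetical helper: merge each item's looked-up value (stored under `key`)
# into the item itself and drop the intermediate field.
def flatten_items(items, key = "foo")
  items.map do |item|
    looked_up = item[key]
    # Quoted dictionary entries arrive as JSON strings, so parse them first.
    looked_up = JSON.parse(looked_up) if looked_up.is_a?(String)
    # No dictionary match (or an unexpected value): leave the item unchanged.
    next item unless looked_up.is_a?(Hash)
    item.reject { |k, _| k == key }.merge(looked_up)
  end
end

# Sample data copied from the two output styles shown above.
items = [
  { "objectType" => 0, "productId" => 1536,
    "foo" => "{\"name\": \"red ball\", \"supplier\" : \"Fred\"}" },
  { "objectType" => 0, "productId" => 1529,
    "foo" => { "name" => "yellow rocket", "supplier" => "Steve" } }
]

flattened = flatten_items(items)
puts JSON.pretty_generate(flattened)
```

Inside a ruby filter the same logic would presumably read the array with event.get("[Orders][0][orderedItems]") and write the result back with event.set on the same field reference.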


Perfect. I had an issue because the documents had been assigned a type that wasn't "doc", so when Logstash tried to write them (it writes as type "doc") it failed. I re-indexed a sample index with a type of "doc" to prove the process, though, and it worked a treat.

When I was doing translations without the iteration, I kept the result in single quotes so that I could then do

 json {
        source => "translation"
        remove_field => ["translation"]
    }

on the filter stage which would allow

{
                  "objectType" : 0,
                  "productId" : 1490,
                  "name" : "fire truck",
                  "supplier" : "Jim Bob"
}

rather than

{
                  "objectType" : 0,
                  "productId" : 1490,
                  "foo" : { 
                                    "name" : "fire truck",
                                    "supplier" : "Jim Bob"
                  }
}

The json filter doesn't appear to support iteration in the same way, though, so I can't do that in this situation. I realise I could create two dictionaries and run the filter once for name and once for supplier to get the desired result, but is there a more elegant way?

I cannot think of one :slight_smile:

It was worth checking :slight_smile:

I am very grateful for your insights though!

I've just realised something: there are times when there is more than one order against a customer, but the [0] would seem to limit it to only the first order in the array. I've tried
iterate_on => "[Orders][orderedItems]"
and
iterate_on => "[Orders][][orderedItems]"

but neither seems to work and I can't find the required syntax online. You wouldn't happen to know what magical character unlocks the behaviour I need?

I do not see any code in the filter that would allow it to iterate a pair of nested arrays. Depending on what you need to do with the events subsequently it might be viable to use a split filter to separate the orders into multiple events, at which point you could use translate as shown above.

If you have a known number of orderedItems (or an upper limit on them) then you could grit your teeth and repeat the filter once per index:

    translate {
        iterate_on => "[Orders]"
        field => "[orderedItems][0][productId]"
        destination => "[orderedItems][0][foo]"
    [...]
    }
    translate {
        iterate_on => "[Orders]"
        field => "[orderedItems][1][productId]"
        destination => "[orderedItems][1][foo]"
    [...]
    }
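
If neither of those fits, the nested loop itself is short to express in Ruby. The following is only a sketch in plain Ruby over ordinary hashes, not something the translate filter does: `DICTIONARY` and `enrich_orders` are hypothetical names, the dictionary is hard-coded here rather than loaded from the YAML file, and the second order (123457) and product 9999 are made-up sample data to show the multi-order and no-match cases.

```ruby
# Hypothetical in-memory dictionary; in practice it would come from the YAML file.
DICTIONARY = {
  "1536" => { "name" => "red ball", "supplier" => "Fred" },
  "1529" => { "name" => "yellow rocket", "supplier" => "Steve" }
}.freeze

# Hypothetical helper: walk every order and every ordered item, merging the
# dictionary entry (if any) for the item's productId into the item.
def enrich_orders(orders, dictionary)
  orders.map do |order|
    items = (order["orderedItems"] || []).map do |item|
      extra = dictionary[item["productId"].to_s] # dictionary keys are strings
      extra ? item.merge(extra) : item           # unmatched items pass through
    end
    order.merge("orderedItems" => items)
  end
end

# Made-up sample: two orders, one item without a dictionary entry.
orders = [
  { "orderId" => "123456",
    "orderedItems" => [{ "objectType" => 0, "productId" => 1536 }] },
  { "orderId" => "123457",
    "orderedItems" => [{ "objectType" => 0, "productId" => 1529 },
                       { "objectType" => 0, "productId" => 9999 }] }
]

enriched = enrich_orders(orders, DICTIONARY)
```

The helper returns new hashes rather than modifying the input, so in a ruby filter the result would still need to be written back to the event.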
