Processing Multiple Rows From JDBC Streaming

I configured it to use split, but it looks like that is going to produce unfavorable results. The test record I tried it on only has one question, but it appears that it will end up creating an array. Is it possible to set the target in split to something like target => "[question][%{questionid}]"? The intent is to put all the fields in a separate object under each question id, like this:

questionid2.integeranswer
questionid2.dateanswer
questionid2.textanswer
questionid2.decimalanswer
questionid2.booleananswer
questionid4.integeranswer
questionid4.dateanswer
questionid4.textanswer
questionid4.decimalanswer
questionid4.booleananswer

No, but you can reformat the array of hashes and change

 [ { "questionid": 2, "integeranswer": null, ... },
{ "questionid": 4, "integeranswer": null, ... } ]

into

[ { "questionid2": { "integeranswer": null, ... } },
{ "questionid4": {"integeranswer": null, ... } } ]

using a ruby filter.

ruby {
    code => '
        data = event.get("jdbcOutput")
        if data.is_a? Array
            newData = []
            data.each { |h|
                q = h.delete("questionid")
                newData << { "question#{q}" => h }
            }
            event.set("jdbcOutput", newData)
        end
    '
}

then split that.
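
The reshaping step can be tried outside Logstash before wiring it into a pipeline. This is a minimal sketch in plain Ruby, assuming made-up sample rows; the `rows` variable is hypothetical and stands in for whatever event.get returns for your field.

```ruby
# Sample rows standing in for the array that jdbc_streaming stored in the
# field (the values are made up for illustration).
rows = [
  { "questionid" => 2, "integeranswer" => nil, "textanswer" => "yes" },
  { "questionid" => 4, "integeranswer" => 7,   "textanswer" => nil }
]

# Same logic as the filter: pull questionid out of each hash and use it to
# build the key that wraps the remaining fields.
reshaped = rows.map do |h|
  q = h.delete("questionid")
  { "question#{q}" => h }
end

p reshaped
```

Each element of reshaped is a one-key hash (question2, question4) wrapping the remaining answer fields, which is the shape the ruby filter above writes back with event.set.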

At least I was researching the right path, using ruby to manipulate the document. This is well above anything I understand in Ruby. Was I supposed to modify it, like changing the event.get from jdbcOutput to my actual field name, or just copy/paste? I copy/pasted it and nothing happens: no errors, no change in field structure.

You would need to update the event.get and .set calls to use the name of the field that your jdbc_streaming filter fetched data into.

Alright, that does the bulk of the work, thank you. However, when splitting the fields out, they all end up under question, so the fields look like question.question2.integeranswer. Is there a modification to this Ruby code to make it just question2.integeranswer, or do I need to add another ruby script after this to modify them all?

Perhaps

        data.each { |h|
            q = h.delete("questionid")
            event.set("question#{q}", h)
        }
    end

would work better for you, and maybe change the event.get to event.remove

Or if you need the array so that you can do the split then move the data using a ruby filter like this one.
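
The top-level variant suggested above can be sketched in plain Ruby. Here a Hash named `event` is a hypothetical stand-in for the Logstash event: `event.delete` plays the role of event.remove, and `event[...] =` plays the role of event.set. The sample data is made up.

```ruby
# A plain Hash standing in for the Logstash event (illustration only).
event = {
  "host" => "example",
  "question" => [
    { "questionid" => 2, "integeranswer" => nil, "textanswer" => "yes" },
    { "questionid" => 4, "integeranswer" => 7,   "textanswer" => nil }
  ]
}

# Take the array off the event (mirrors swapping event.get for event.remove).
data = event.delete("question")

if data.is_a?(Array)
  data.each do |h|
    q = h.delete("questionid")
    # Write each question directly at the top level instead of nesting it.
    event["question#{q}"] = h
  end
end

p event
```

After this runs, the stand-in event has question2 and question4 as top-level keys and no question key, so there is nothing left to split.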

That did it, I appreciate the assistance. I've said it before, I need to find time to learn ruby so I can do this kind of stuff myself...or at the very least understand what the hell is going on here.

Follow-up question: can this ruby script also be used to remove a field if its value is null? I've tried an additional ruby script that I've used previously to remove empty fields, but it doesn't work on these fields, I'm assuming because they're nested.

      ruby {
        code => '
            event.to_hash.each { |k, v|
                if v.kind_of? String
                    if v == ""
                        event.remove(k)
                    end
                end
            }
        '
      }

That is correct. That ruby filter only checks the top-level fields. This is an example of ruby code that recursively descends into fields of an event and modifies them.
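
The difference between a top-level scan and a recursive one can be seen in plain Ruby. This is only a sketch: the `event` Hash and the `empty_paths` helper are hypothetical names, and the data is made up. The helper builds the same "[outer][inner]" style of field reference that Logstash uses.

```ruby
# A plain Hash standing in for the event (illustration only).
event = {
  "message" => "",
  "question2" => { "textanswer" => "", "integeranswer" => 5 }
}

# Top-level scan: only sees fields whose own value is an empty string.
top_level_empty = event.select { |_k, v| v.is_a?(String) && v == "" }.keys

# Recursive scan: descends into hashes and arrays, building a
# "[outer][inner]" reference for every empty string it finds.
def empty_paths(object, name)
  case object
  when Hash
    object.flat_map { |k, v| empty_paths(v, "#{name}[#{k}]") }
  when Array
    object.each_index.flat_map { |i| empty_paths(object[i], "#{name}[#{i}]") }
  else
    object == "" ? [name] : []
  end
end

nested_empty = event.flat_map { |k, v| empty_paths(v, "[#{k}]") }

p top_level_empty
p nested_empty
```

The top-level scan finds only message, while the recursive scan also finds [question2][textanswer].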

I'm very much lost in understanding most of this, but I THINK I changed it to remove null fields? Although, I think I'm missing something, because I don't see where a target field is referenced. Is it looking at the entire event or just the message field?
I'm not quite sure how to interpret what it's doing after the first couple of logic checks. Define the function EmptyField, then if the object exists, and the object is a hash and not empty, then for each entry in the hash do....something...

ruby {
        code => '
            def EmptyField(object, name, event)
                if object
                    if object.kind_of?(Hash) and object != {}
                        object.each { |k, v| EmptyField(v, "#{name}[#{k}]", event) }
                    elsif object.kind_of?(Array) and object != []
                        object.each_index { |i|
                            EmptyField(object[i], "#{name}[#{i}]", event)
                        }
                    else
                        lastElement = name.gsub(/^.*\[/, "").gsub(/\]$/, "")
                        if lastElement.length = "null"
                            event.remove(name)
                        end
                    end
                end
            end

            event.to_hash.each { |k, v|
                EmptyField(v, "[#{k}]", event)
            }
        '
    }

No...this did not work...appears to remove everything in the event, lol.

That is never going to be true, so yes, it will delete everything. I think you can replace

                    lastElement = name.gsub(/^.*\[/, "").gsub(/\]$/, "")
                    if lastElement.length = "null"

with

                    if object = "null"

I did catch that .length and removed it before testing it out. I made the change you suggested, but it still removes the whole event.

      ruby {
              code => '
                  def EmptyField(object, name, event)
                      if object
                          if object.kind_of?(Hash) and object != {}
                              object.each { |k, v| EmptyField(v, "#{name}[#{k}]", event) }
                          elsif object.kind_of?(Array) and object != []
                              object.each_index { |i|
                                  EmptyField(object[i], "#{name}[#{i}]", event)
                              }
                          else
                              if object = "null"
                                  event.remove(name)
                              end
                          end
                      end
                  end
      
                  event.to_hash.each { |k, v|
                      EmptyField(v, "[#{k}]", event)
                  }
              '
          }

I'm running this AFTER the first ruby script, which restructures the object into the form below. Should it be running first?

"question": {
  "question2": {
    "textanswer": null
  },
  "question4": {
    "textanswer": null
  }
}

Sorry, that is an assignment so it is unconditionally true. Try if object == "null". Note that null in JSON will be nil in Ruby, so you may need if ! object instead.
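
The three checks discussed here behave differently on a value that came from a JSON null. A small plain-Ruby sketch, using the standard json library and a made-up one-field document:

```ruby
require "json"

# A JSON null is parsed into Ruby nil, not into the string "null".
parsed = JSON.parse('{ "textanswer": null }')
object = parsed["textanswer"]

compares_to_null_string = (object == "null")  # comparing nil with a string
compares_to_nil_string  = (object == "nil")   # comparing nil with a string
is_nil                  = object.nil?         # asks whether the value is nil
negated                 = !object             # nil is falsy, so this negates to true

# A single = is an assignment. The expression evaluates to the assigned
# string, which is truthy no matter what the variable held before.
copy = object
assigned = (copy = "null")

p compares_to_null_string, compares_to_nil_string, is_nil, negated, assigned
```

Only object.nil? and ! object detect the null; both string comparisons are false, and the assignment always yields a truthy value. Note that ! object is also true when the value is false, so object.nil? is the narrower test.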

I would run it after the restructuring.

Tried if object == "null", if object == "nil", and if ! object and none of them worked. They didn't delete the event, but they didn't seem to do anything either.

What does an event look like if you use

output { stdout { codec => rubydebug } }

(Doesn't have to be stdout, you could use a file output if it is more convenient.)

{
	"question39": {
		"dec": null,
		"question": "Sup Brah?",
		"bool": null,
		"date": null,
		"text": null,
		"int": null
	}
}

The full pipeline config that manipulates the data is here:

    if [question] {
      ruby {
        code => '
          data = event.get("question")
          if data.is_a? Array
            newData = []
            data.each { |h|
              q = h.delete("questionid")
              event.set("question#{q}", h)
            }
          end
        '
      }
      ruby {
              code => '
                  def EmptyField(object, name, event)
                      if object
                          if object.kind_of?(Hash) and object != {}
                              object.each { |k, v| EmptyField(v, "#{name}[#{k}]", event) }
                          elsif object.kind_of?(Array) and object != []
                              object.each_index { |i|
                                  EmptyField(object[i], "#{name}[#{i}]", event)
                              }
                          else
                              if object == "nil"
                                  event.remove(name)
                              end
                          end
                      end
                  end
      
                  event.to_hash.each { |k, v|
                      EmptyField(v, "[#{k}]", event)
                  }
              '
          }
      split {
        field => "question"
      }
      mutate {
        remove_field => [ "[question]" ]
      }
    }

The first thing that the EmptyField function tests is if object, so if ! object inside that branch cannot possibly test true. With that outer test removed, this configuration

input { generator { count => 1 lines => [ '{ "question39": { "dec": null, "question": "Sup Brah?", "bool": null, "date": null, "text": null, "int": null } }' ] codec => json } }
filter {
    ruby {
        code => '
            def EmptyField(object, name, event)
                if object.kind_of?(Hash) and object != {}
                    object.each { |k, v| EmptyField(v, "#{name}[#{k}]", event) }
                elsif object.kind_of?(Array) and object != []
                    object.each_index { |i|
                        EmptyField(object[i], "#{name}[#{i}]", event)
                    }
                else
                    if ! object
                        event.remove(name)
                    end
                end
            end

            event.to_hash.each { |k, v|
                EmptyField(v, "[#{k}]", event)
            }
        '
    }
}
output { stdout { codec => rubydebug { metadata => false } } }

results in

"question39" => {
    "question" => "Sup Brah?"
}
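
The same pruning can be checked without running Logstash. This is a sketch in plain Ruby, not the filter itself: `remove_nils` is a hypothetical helper that returns a cleaned copy instead of calling event.remove, and it drops only nil values, whereas the ! object test in the filter would also drop false.

```ruby
# Recursively build a copy of the structure without nil values.
def remove_nils(object)
  case object
  when Hash
    object.each_with_object({}) do |(k, v), out|
      cleaned = remove_nils(v)
      out[k] = cleaned unless cleaned.nil?
    end
  when Array
    object.map { |v| remove_nils(v) }.compact
  else
    object
  end
end

# The sample event from this thread, written as a Ruby hash.
event = {
  "question39" => {
    "dec" => nil,
    "question" => "Sup Brah?",
    "bool" => nil,
    "date" => nil,
    "text" => nil,
    "int" => nil
  }
}

cleaned = remove_nils(event)
p cleaned
```

For the sample event, only the question field survives under question39, matching the rubydebug output above.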

Awesome, thank you. This is where my inability to understand Ruby really shines, lol. Just one more thing to add to the list of things I need to find time to learn.