Document structure for recursively nested fields

Hi there,

I am trying to use Elasticsearch (with a Kibana frontend) to extract statistics from many Robot Framework test runs.

Robot Framework outputs an XML file for each test execution, with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<robot rpa="false">
    <suite source="[...]" id="s1" name="TestSuite">
         <suite source="[...]" id="s1-s1" name="TestSubSuite">
              <test id="s1-s1-t1" name="Passing test">
                  <kw name="myKeyword">
                      <kw name="nestedKeyword">
                      </kw>
                  </kw>
                  <status status="PASS" endtime="[...]" critical="yes" starttime="[...]">All good</status>
              </test>
              <test id="s1-s1-t2" name="Passing test">
                  <kw name="Log" library="BuiltIn">
                  </kw>
                  <status status="PASS" endtime="[...]" critical="yes" starttime="[...]">All good</status>
              </test>
        </suite>
    </suite>
</robot>

As you can see, there are two main issues:

  1. The information is nested across several levels.
  2. The depth of nesting is not fixed (there can be a robot.suite.test.status, but also a robot.suite.suite.suite.test.status).
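
To make the second point concrete, here is a minimal Python sketch (standard library only, separate from the Logstash pipeline) that walks a small, made-up Robot Framework style document and prints the dotted path at which each test sits. The sample XML, the suite names in it and the helper name `iter_test_paths` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down output file with tests at two different depths.
SAMPLE = """<robot rpa="false">
  <suite id="s1" name="TestSuite">
    <suite id="s1-s1" name="TestSubSuite">
      <test id="s1-s1-t1" name="Passing test"/>
    </suite>
    <suite id="s1-s2" name="DeepSuite">
      <suite id="s1-s2-s1" name="DeeperSuite">
        <test id="s1-s2-s1-t1" name="Another passing test"/>
      </suite>
    </suite>
  </suite>
</robot>"""


def iter_test_paths(element, path=("robot",)):
    """Yield (dotted_path, test_id) for every <test> below element, at any depth."""
    for child in element:
        if child.tag == "suite":
            # A suite can hold further suites, so recurse with a longer path.
            yield from iter_test_paths(child, path + ("suite",))
        elif child.tag == "test":
            yield ".".join(path + ("test",)), child.get("id")


found = list(iter_test_paths(ET.fromstring(SAMPLE)))
for dotted_path, test_id in found:
    print(dotted_path, test_id)
```

The two tests come out under different paths (robot.suite.suite.test and robot.suite.suite.suite.test), which is why a single fixed field reference cannot reach all of them.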

At the moment, I am parsing the document with Logstash as follows:

input {
    file {
        path => "[...]/Outputs/*.xml"
        start_position => "beginning"
        sincedb_path => "NUL"
        codec => multiline {    
            pattern => "^<\?xml .*\?>"
            negate => true
            what => "previous"
            auto_flush_interval => 1
            max_lines => 100000
        }
    }
}

filter {
    xml { 
        source => "message"
        target => "RBF"
        store_xml => true
        suppress_empty => false
        force_array => false
    }
    mutate {
        remove_field => ["message","[RBF][statistics]","[RBF][errors]","[RBF][rpa]","[RBF][generator]"]
    }
    split { 
        field => "[RBF][suite][test]"
    }
}

output {
    file {
        path => "[...]"
        write_behavior => "append"
        codec => rubydebug
    }
    elasticsearch {
        hosts => "localhost:9200"
        manage_template => false
        index => "logstash-robotframework-%{+YYYY.MM.dd}"
    }
}

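This gives me one separate document per test when there is only one level of suite nesting, but nothing is split when the tests are nested two or more suite levels deep (the tests then sit under [RBF][suite][suite][test], which the split filter is not pointed at).

Here is an example of a document generated when the parsing works: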

{
    "@timestamp" => 2019-07-12T15:18:04.106Z,
      "@version" => "1",
          "tags" => [
              [0] "multiline"
           ],
           "RBF" => {
            "suite" => {
                "id" => "s1",
            "source" => "[...]",
              "name" => "Dummy3",
            "status" => {
                  "endtime" => "20190712 17:03:06.328",
                   "status" => "PASS",
                "starttime" => "20190712 17:03:06.265"
            },
              "test" => {
                  "name" => "Test 1.3",
                    "kw" => {
                      "library" => "BuiltIn",
                          "doc" => "Skips rest of the current test, setup, or teardown with PASS status.",
                         "name" => "Pass Execution",
                       "status" => {
                          "endtime" => "20190712 17:03:06.328",
                           "status" => "PASS",
                        "starttime" => "20190712 17:03:06.328"
                    },
                    "arguments" => {
                        "arg" => "All good in the hood"
                    },
                          "msg" => {
                            "level" => "INFO",
                          "content" => "Execution passed with message: All good in the hood",
                        "timestamp" => "20190712 17:03:06.328"
                    }
                },
                "status" => {
                      "endtime" => "20190712 17:03:06.328",
                     "critical" => "yes",
                       "status" => "PASS",
                      "content" => "All good in the hood",
                    "starttime" => "20190712 17:03:06.328"
                },
                    "id" => "s1-t3"
            }
        },
        "generated" => "20190712 17:03:06.265"
    },
          "host" => "sahnlpt0262",
          "path" => "[...]"
}

If possible, I would like one separate document per test, at any level of nesting, while keeping the surrounding information (the "robot" attributes, the parent suites, each suite's status, etc.).
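
As a rough illustration of the document shape described above, the following Python sketch (standard library only, run outside Logstash) flattens a made-up output file into one dictionary per test, each carrying the root attributes and the chain of parent suites with their statuses. The sample XML and the names `status_of`, `flatten_tests`, `suites` and `suite_path` are hypothetical, and keywords (`kw`) are left out for brevity; this is one possible layout, not an established convention.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down output file; real files carry more attributes.
SAMPLE = """<robot rpa="false">
  <suite id="s1" name="TestSuite">
    <suite id="s1-s1" name="TestSubSuite">
      <test id="s1-s1-t1" name="Passing test">
        <kw name="Log" library="BuiltIn"><status status="PASS"/></kw>
        <status status="PASS" critical="yes">All good</status>
      </test>
      <status status="PASS"/>
    </suite>
    <suite id="s1-s2" name="DeepSuite">
      <suite id="s1-s2-s1" name="DeeperSuite">
        <test id="s1-s2-s1-t1" name="Failing test">
          <status status="FAIL" critical="yes">Not good</status>
        </test>
        <status status="FAIL"/>
      </suite>
      <status status="FAIL"/>
    </suite>
    <status status="FAIL"/>
  </suite>
</robot>"""


def status_of(element):
    """Return the direct <status> child as a dict, or None if there is none."""
    status = element.find("status")  # direct children only, so kw statuses are skipped
    if status is None:
        return None
    doc = dict(status.attrib)
    if status.text and status.text.strip():
        doc["content"] = status.text.strip()
    return doc


def flatten_tests(element, robot_attrs, parents=()):
    """Yield one flat document per <test>, whatever its nesting depth."""
    for child in element:
        if child.tag == "suite":
            suite = {"id": child.get("id"), "name": child.get("name"),
                     "status": status_of(child)}
            yield from flatten_tests(child, robot_attrs, parents + (suite,))
        elif child.tag == "test":
            yield {
                "robot": robot_attrs,
                "suites": list(parents),  # outermost suite first
                "suite_path": "/".join(s["name"] for s in parents),
                "test": {"id": child.get("id"), "name": child.get("name"),
                         "status": status_of(child)},
            }


root = ET.fromstring(SAMPLE)
docs = list(flatten_tests(root, dict(root.attrib)))
for doc in docs:
    print(json.dumps(doc))  # one JSON object per line
```

Each printed line is a self-contained JSON object, so every test has the same field layout regardless of depth. The plain `suite_path` string is included because arrays of objects such as `suites` are flattened by Elasticsearch unless the field is mapped as `nested`, so a single string may be easier to filter and aggregate on.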

How should I organize my document?

Thanks in advance!
