Document structure for recursively nested fields

florent · July 13, 2019, 9:05am

Hi there,

I am trying to use Elasticsearch (with a Kibana frontend) to extract statistics on many Robot Framework test runs.

Robot Framework outputs an XML file for each test execution, with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<robot rpa="false">
    <suite source="[...]" id="s1" name="TestSuite">
         <suite source="[...]" id="s1-s1" name="TestSubSuite">
              <test id="s1-s1-t1" name="Passing test">
                  <kw name="myKeyword">
                      <kw name="nestedKeyword">
                      </kw>
                  </kw>
                  <status status="PASS" endtime="[...]" critical="yes" starttime="[...]">All good</status>
              </test>
              <test id="s1-s1-t2" name="Passing test">
                  <kw name="Log" library="BuiltIn">
                  </kw>
                  <status status="PASS" endtime="[...]" critical="yes" starttime="[...]">All good</status>
              </test>
         </suite>
    </suite>
</robot>

As you can see, there are two main issues:

The information is nested across several levels.
The depth of nesting is indeterminate (there can be a robot.suite.test.status but also a robot.suite.suite.suite.test.status).
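
To make the second issue concrete, here is a minimal Python sketch, independent of Logstash, that walks a trimmed-down file of the same shape and lists the path of every test status. The sample XML, the suite and test names in it, and the helper name `status_paths` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample in the shape of the output.xml above.
SAMPLE = """<robot rpa="false">
  <suite id="s1" name="TestSuite">
    <suite id="s1-s1" name="TestSubSuite">
      <test id="s1-s1-t1" name="Passing test">
        <status status="PASS">All good</status>
      </test>
    </suite>
    <suite id="s1-s2" name="OtherSubSuite">
      <suite id="s1-s2-s1" name="DeeperSuite">
        <test id="s1-s2-s1-t1" name="Deeper test">
          <status status="PASS">All good</status>
        </test>
      </suite>
    </suite>
  </suite>
</robot>"""


def status_paths(element, prefix=""):
    """Yield the dotted path of every test status, at any suite depth."""
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    if element.tag == "test":
        if element.find("status") is not None:
            yield f"{path}.status"
        return
    # Only suites and tests are followed; keywords are ignored here.
    for child in element:
        if child.tag in ("suite", "test"):
            yield from status_paths(child, path)


root = ET.fromstring(SAMPLE)
paths = list(status_paths(root))
for p in paths:
    print(p)
```

The two printed paths have different lengths, which is why no single fixed field path can address every test.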

At the moment, I am parsing the documents with Logstash as follows:

input {
    file {
        path => "[...]/Outputs/*.xml"
        start_position => "beginning"
        sincedb_path => "NUL"
        codec => multiline {    
            pattern => "^<\?xml .*\?>"
            negate => true
            what => "previous"
            auto_flush_interval => 1
            max_lines => 100000
        }
    }
}

filter {
    xml { 
        source => "message"
        target => "RBF"
        store_xml => true
        suppress_empty => false
        force_array => false
    }
    mutate {
        remove_field => ["message","[RBF][statistics]","[RBF][errors]","[RBF][rpa]","[RBF][generator]"]
    }
    split { 
        field => "[RBF][suite][test]"
    }
}

output {
    file {
        path => "[...]"
        write_behavior => "append"
        codec => rubydebug
    }
    elasticsearch {
        hosts => "localhost:9200"
        manage_template => false
        index => "logstash-robotframework-%{+YYYY.MM.dd}"
    }
}

This gives me one separate document per test when there is only one level of suite nesting, but the split does not happen when the tests are nested in two or more levels of suites.

Here is an example of a document generated when the parsing works:
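
The limitation can be illustrated with a small Python sketch on plain dictionaries. It assumes that, with force_array => false, a single child element is parsed into a hash and repeated children into an array; the two sample structures and the helper names `as_list` and `collect_tests` are hypothetical. A fixed path such as [RBF][suite][test] only exists in the first structure, whereas a recursive walk finds the tests in both.

```python
def as_list(value):
    """Normalize a value that may be missing, a single dict, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def collect_tests(suite):
    """Return every test dict found under a suite dict, whatever the depth."""
    tests = list(as_list(suite.get("test")))
    for child in as_list(suite.get("suite")):
        tests.extend(collect_tests(child))
    return tests


# Hypothetical parsed structures, shaped like the "RBF" field (assumption).
one_level = {"suite": {"id": "s1", "test": [{"id": "s1-t1"}, {"id": "s1-t2"}]}}
two_levels = {
    "suite": {"id": "s1", "suite": {"id": "s1-s1", "test": {"id": "s1-s1-t1"}}}
}

# The fixed path suite.test is present only when tests sit one level deep.
fixed_path_one = one_level["suite"].get("test") is not None
fixed_path_two = two_levels["suite"].get("test") is not None
print(fixed_path_one, fixed_path_two)

# The recursive walk reaches the tests in both cases.
ids_one = [t["id"] for t in collect_tests(one_level["suite"])]
ids_two = [t["id"] for t in collect_tests(two_levels["suite"])]
print(ids_one)
print(ids_two)
```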

{
    "@timestamp" => 2019-07-12T15:18:04.106Z,
      "@version" => "1",
          "tags" => [
              [0] "multiline"
           ],
           "RBF" => {
            "suite" => {
                "id" => "s1",
            "source" => "[...]",
              "name" => "Dummy3",
            "status" => {
                  "endtime" => "20190712 17:03:06.328",
                   "status" => "PASS",
                "starttime" => "20190712 17:03:06.265"
            },
              "test" => {
                  "name" => "Test 1.3",
                    "kw" => {
                      "library" => "BuiltIn",
                          "doc" => "Skips rest of the current test, setup, or teardown with PASS status.",
                         "name" => "Pass Execution",
                       "status" => {
                          "endtime" => "20190712 17:03:06.328",
                           "status" => "PASS",
                        "starttime" => "20190712 17:03:06.328"
                    },
                    "arguments" => {
                        "arg" => "All good in the hood"
                    },
                          "msg" => {
                            "level" => "INFO",
                          "content" => "Execution passed with message: All good in the hood",
                        "timestamp" => "20190712 17:03:06.328"
                    }
                },
                "status" => {
                      "endtime" => "20190712 17:03:06.328",
                     "critical" => "yes",
                       "status" => "PASS",
                      "content" => "All good in the hood",
                    "starttime" => "20190712 17:03:06.328"
                },
                    "id" => "s1-t3"
            }
        },
        "generated" => "20190712 17:03:06.265"
    },
          "host" => "sahnlpt0262",
          "path" => "[...]"
}

If possible, I would like one separate document per test, at any level of nesting, while keeping the surrounding information (the "robot" attributes, the parent suites, each suite's status, etc.).
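
One possible shape for such documents is sketched below in Python, as a preprocessing step outside of Logstash rather than a definitive solution. Each test becomes one flat document that carries the robot attributes and the chain of parent suites. The sample XML, the field names (`robot`, `suites`, `suite_path`, `test`) and the helper names are assumptions. The `suite_path` string is included because Elasticsearch flattens arrays of objects unless the field is mapped as nested, so a plain string may be easier to aggregate on in Kibana.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of the output.xml above.
SAMPLE = """<robot rpa="false">
  <suite source="[...]" id="s1" name="TestSuite">
    <suite source="[...]" id="s1-s1" name="TestSubSuite">
      <test id="s1-s1-t1" name="Passing test">
        <status status="PASS" critical="yes">All good</status>
      </test>
      <test id="s1-s1-t2" name="Another passing test">
        <status status="PASS" critical="yes">All good</status>
      </test>
      <status status="PASS"></status>
    </suite>
    <status status="PASS"></status>
  </suite>
</robot>"""


def suite_info(suite):
    """Keep only the suite's own attributes and status, not its children."""
    info = {key: suite.get(key) for key in ("id", "name", "source")}
    status = suite.find("status")
    if status is not None:
        info["status"] = dict(status.attrib)
    return info


def flatten(element, robot_attrs, parents=()):
    """Yield one document per test, with its ancestor suites attached."""
    for child in element:
        if child.tag == "suite":
            yield from flatten(child, robot_attrs, parents + (suite_info(child),))
        elif child.tag == "test":
            status = child.find("status")
            yield {
                "robot": robot_attrs,
                "suites": list(parents),
                "suite_path": "/".join(s["name"] or "" for s in parents),
                "test": {
                    "id": child.get("id"),
                    "name": child.get("name"),
                    "status": dict(status.attrib) if status is not None else None,
                    "message": status.text if status is not None else None,
                },
            }


root = ET.fromstring(SAMPLE)
docs = list(flatten(root, dict(root.attrib)))
for doc in docs:
    print(json.dumps(doc))
```

Each printed line is a standalone JSON document, so the output could in principle be fed to Logstash with a JSON codec or sent to Elasticsearch directly; keyword data is left out here to keep the sketch short.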

How should I organize my document?

Thanks in advance!

