Logstash json to multiple documents

vikasp · November 7, 2019, 8:09pm

I have an input in the below format:
<data><te0><id>1</id><text>this is first event</text></te0><te1><id>2</id><text>this is second event</text></te1><te2><id>3</id><text>this is third event</text></te2><te0><id>4</id><text>this is fourth event</text></te0></data>

The above is a single input event to Logstash. I want to convert this single event into multiple events and push them to Elasticsearch, one document per te0 element only, each identified by its id. So, for the above example, the result should be:
say index: xml_test
xml_test/doc/1/: { "_id": 1, "text": "this is first event" }
xml_test/doc/4/: { "_id": 4, "text": "this is fourth event" }

Below is the logstash config I am trying to use:
    filter {
        grok {
            match => [ "message", "%{GREEDYDATA:inxml}" ]
        }
        xml {
            source => "inxml"
            target => "xmldata"
            force_array => false
        }
        json {
            source => "xmldata"
        }
        split {
            field => "xmldata"
        }
    }

I am getting a _jsonparsefailure tag, although I can see that the field already shows up as JSON in Elasticsearch. I also get a _split_type_failure tag, because split only works on a string or an array, and xmldata is a hash.
The result looks something like this:
    "xmldata": {
        "te0": {
            "text": "this is a second doc",
            "id": "1"
        },
        "te2": {
            "text": " this is supposed to be 3rdor4th doc",
            "id": "2"
        }
    },

and

    "tags": [
        "_jsonparsefailure",
        "_split_type_failure"
    ],

How can I convert the above input to multiple docs for the values in attributes only inside

Badger · November 7, 2019, 8:42pm

What does your data look like?

vikasp · November 7, 2019, 8:52pm

Hi,
I did include a sample event; I am not sure what happened to it. Here it is:
<data><te0><id>1</id><text>this is first event</text></te0><te1><id>2</id><text>this is second event</text></te1><te0><id>3</id><text>this is third event</text></te0><te2><id>4</id><text>this is fourth event</text></te2></data>

Badger · November 7, 2019, 11:35pm

Parsing your sample xml results in

    "xmldata" => {
        "te0" => [
            [0] {
                "text" => "this is first event",
                  "id" => "1"
            },
            [1] {
                "text" => "this is third event",
                  "id" => "3"
            }
        ],
        "te1" => {
            "text" => "this is second event",
              "id" => "2"
        },
        "te2" => {
            "text" => "this is fourth event",
              "id" => "4"
        }
    },

You say you want ids 1 and 4. What test do you use to drop ids 2 and 3?

vikasp · November 8, 2019, 12:34am

All I want is to push separate documents to Elasticsearch for id 1 and id 3, and to remove anything other than te0.
So the output should index two documents into Elasticsearch. Below are the two docs:
doc 1 with id: 1 and doc as { "id":1, "text": "this is first event", "@timestamp":".....".......all metadata...}
doc 2 with id: 3 and doc as { "id":3, "text": "this is third event", "@timestamp":".....".......all metadata...}
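
The target shape described above can be sketched in plain Ruby, outside Logstash. This is only an illustration of the intended mapping, not a Logstash config: `to_documents` is a hypothetical helper name, and the input hash assumes the xml filter output uses string ids and puts repeated te0 elements in an array.

```ruby
# Plain Ruby, no Logstash APIs. `to_documents` is a hypothetical helper.
# It keeps only the entries under `key` and maps each one to a document
# body, keyed by the entry's id (the intended Elasticsearch document id).
def to_documents(xmldata, key = "te0")
  entries = xmldata[key]
  return {} if entries.nil?
  # Assumption: a single te0 element arrives as a lone hash, not an array.
  entries = [entries] unless entries.is_a?(Array)
  entries.each_with_object({}) do |entry, docs|
    docs[entry["id"]] = { "id" => entry["id"], "text" => entry["text"] }
  end
end

xmldata = {
  "te0" => [
    { "text" => "this is first event", "id" => "1" },
    { "text" => "this is third event", "id" => "3" }
  ],
  "te1" => { "text" => "this is second event", "id" => "2" },
  "te2" => { "text" => "this is fourth event", "id" => "4" }
}

docs = to_documents(xmldata)
docs.each { |doc_id, body| puts "xml_test/doc/#{doc_id}: #{body}" }
```

Note that the ids stay strings here; converting them to integers would be a separate step.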

Badger · November 8, 2019, 1:10am

So if you iterate over the members of xmldata you want to ignore any that are not arrays? And if they are arrays then split them?

vikasp · November 9, 2019, 6:31am

I want only the members under xmldata with the key te0, and I want to ignore everything else (there can be multiple te1, te2, te3 and so on). Most of them (te*) are arrays.

vikasp · November 9, 2019, 6:34am

The xml filter merges all te0 elements into one array under a single te0 key; the first occurrence of te0 in the XML becomes the first element of the array, and so on. Right now my split creates documents containing te0[0] plus the rest of the fields (te1, te2, ...), te0[1] plus the rest of the fields (te1, te2, ...), and so on.
But what I want is only te0[0] as one document, without any other te*s, and te0[1] as another document.

Badger · November 9, 2019, 2:18pm

Try

    ruby {
        code => '
            event.get("xmldata").each { |k, v|
                unless k == "te0"
                    event.remove("[xmldata][#{k}]")
                end
            }
        '
    }
    split { field => "[xmldata][te0]" }
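
To see what those two filters do to the sample event, here is a rough simulation in plain Ruby. It does not use any Logstash API: the event is modelled as an ordinary hash, and `keep_only` and `split_on` are hypothetical helper names that stand in for the ruby filter and the split filter respectively.

```ruby
# Plain-Ruby simulation of "remove everything except te0, then split".
# No Logstash classes are involved; an event is just a Hash here.

# Stand-in for the ruby filter: drop every key under xmldata except `key`.
def keep_only(event, key)
  xmldata = event.fetch("xmldata", {})
  event.merge("xmldata" => xmldata.select { |k, _v| k == key })
end

# Stand-in for the split filter: one copy of the event per array element,
# with the array replaced by that single element.
def split_on(event, key)
  values = event.dig("xmldata", key)
  return [] if values.nil?
  # Assumption made for this sketch only: a lone hash is treated as a
  # one-element list. The real split filter tags such an event with
  # _split_type_failure instead, as reported earlier in this thread.
  values = [values] unless values.is_a?(Array)
  values.map { |v| event.merge("xmldata" => { key => v }) }
end

event = {
  "xmldata" => {
    "te0" => [
      { "text" => "this is first event", "id" => "1" },
      { "text" => "this is third event", "id" => "3" }
    ],
    "te1" => { "text" => "this is second event", "id" => "2" },
    "te2" => { "text" => "this is fourth event", "id" => "4" }
  }
}

events = split_on(keep_only(event, "te0"), "te0")
events.each { |e| puts e.inspect }
```

With the sample data this yields two events, each holding a single te0 hash and no te1 or te2. If an input ever contains only one te0 element and force_array is false, the field would presumably be a hash rather than an array, so that case may need separate handling before the split.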

system · December 7, 2019, 2:31pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
