Parse a JSON document that contains XML

Hello all,

I want to send the following JSON document to Elasticsearch through Logstash:

    { "short_message": "<?xml version="1.0" encoding="utf-8"?>
      <ImportMessageBase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8QVZfQVZJ=</Data>",
      "host": "mock-service-6596db7999-m8r7c"}

I don't want to parse the "short_message" field or break it up into its XML elements.

In Kibana I just want to see the field "short_message" containing my whole XML message, and "host": "mock-service-6596db7999-m8r7c" as a separate field in the same indexed document.

To do so, I run the following command in my Ubuntu terminal:

echo '{ "short_message": "<?xml version="1.0" encoding="utf-8"?>
      <ImportMessageBase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8QVZfQVZJ=</Data>",
	  "host": "mock-service-6596db7999-m8r7c"} ' > /dev/udp/127.0.0.1/5041

The JSON document reaches Elasticsearch, but with a grok parse failure message.


My pipeline configuration is as follows:

input {
    gelf {
        codec => multiline {
            pattern => "<?xml version"
            what => "next"
        }
        port_udp => 5041
        use_udp => true
        id => "gelf"
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "alex"
    }
}


Any suggestions on how I should change my pipeline configuration?

Thank you

Hello Alexandros,

I think your JSON document is actually malformed:

{ "short_message": "<?xml version="1.0" encoding="utf-8"?>

You have to escape the quotes in the XML:

{ "short_message": "<?xml version=\"1.0\" encoding=\"utf-8\"?>
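
If the document is generated by a program rather than typed by hand, a JSON library can do this escaping for you. Here is a minimal sketch; Python and the variable names are assumptions for illustration only and are not part of the setup described above:

```python
import json

# The XML declaration exactly as it should appear in Kibana (no backslashes).
xml_declaration = '<?xml version="1.0" encoding="utf-8"?>'

# json.dumps escapes every inner double quote as \" automatically.
encoded = json.dumps({"short_message": xml_declaration})
print(encoded)

# Decoding gives back the original text, so the backslashes exist only
# in the serialized form, not in the stored field value.
decoded = json.loads(encoded)
print(decoded["short_message"] == xml_declaration)
```

The backslashes belong to the JSON encoding only; a parser that accepts the document removes them again.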

Best regards
Wolfram

Hello @Wolfram_Haussig

Thank you very much for your reply.

I changed the XML part of my JSON as you advised, to the following form:

echo '{ "short_message": "<?xml version=/"1.0/" encoding=/"utf-8/"?>
      <ImportMessageBase xmlns:xsi=/"http://www.w3.org/2001/XMLSchema-instance/" xmlns:xsd=/"http://www.w3.org/2001/XMLSchema/">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8QVZfQVZJ=</Data>",
	  "host": "mock-service-6596db7999-m8r7c"} ' > /dev/udp/127.0.0.1/5041

Nevertheless, my document is now split into 2 parts, as the image below shows, and it contains the characters (\t, \n, \) that I don't want to see in the Kibana UI:

I also receive a JSON grok parse failure, as the image below shows:

The relevant Logstash log shows the following:

[2022-10-29T14:35:07,679][ERROR][logstash.inputs.gelf     ][main][gelf] JSON parse failure. Falling back to plain-text {:error=>#<LogStash::Json::ParserError: Unexpected character ('1' (code 49)): was expecting comma to separate Object entries
 at [Source: (byte[])"{ "short_message": "<?xml version=/"1.0/" encoding=/"utf-8/"?>
      <ImportMessageBase xmlns:xsi=/"http://www.w3.org/2001/XMLSchema-instance/" xmlns:xsd=/"http://www.w3.org/2001/XMLSchema/">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8"[truncated 19 bytes]; line: 1, column: 38]>, :data=>"\"{ \\\"short_message\\\": \\\"<?xml version=/\\\"1.0/\\\" encoding=/\\\"utf-8/\>
< expected a valid value (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (byte[])"\u0009  "host": "mock-service-6596db7999-m8r7c"}
"; line: 1, column: 11]>, :data=>"\"\\t  \\\"host\\\": \\\"mock-service-6596db7999-m8r7c\\\"} \\n\""}


Any ideas on how to solve this?

Thanks a lot in advance.

The escapes should be \" (backslash), not /" (forward slash), throughout.

Hello @Badger, @Wolfram_Haussig

Thank you and sorry for my silly mistake.

I changed the message to the following format:

echo '{ "short_message": "<?xml version=\"1.0\" encoding=\"utf-8\"?>
      <ImportMessageBase xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8QVZfQVZJ=</Data>",
          "host": "mock-service-6596db7999-m8r7c"} ' > /dev/udp/127.0.0.1/5041

Nevertheless, my message is still split into 2 parts, as the image below shows:

More specifically, I again receive a JSON grok parse failure, and in the message field I see the escape character "\" that I inserted:

The relevant Logstash error log is exactly the following:

[2022-11-07T09:27:28,393][ERROR][logstash.inputs.gelf     ][main][gelf] JSON parse failure. Falling back to plain-text {:error=>#<LogStash::Json::ParserError: Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in string value
 at [Source: (byte[])"{ "short_message": "<?xml version=\"1.0\" encoding=\"utf-8\"?>
      <ImportMessageBase xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8"[truncated 19 bytes]; line: 1, column: 64]>, :data=>"\"{ \\\"short_message\\\": \\\"<?xml version=\\\\\\\"1.0\\\\\\\" encoding=\\\\\\>
[2022-11-07T09:27:29,967][ERROR][logstash.inputs.gelf     ][main][gelf] JSON parse failure. Falling back to plain-text {:error=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expec>
 at [Source: (byte[])"\u0009  "host": "mock-service-6596db7999-m8r7c"}
"; line: 1, column: 11]>, :data=>"\"\\t  \\\"host\\\": \\\"mock-service-6596db7999-m8r7c\\\"} \\n\""}


Any ideas on this one?

Thank you for your time.

Best regards,
Alexandros

Hello @Badger, @Wolfram_Haussig

Do you have any feedback on this?

Thank you in advance,

Best regards,
Alexandros

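It is telling you that you cannot have an unescaped newline (character code 10) inside a JSON string value. Every line break within "short_message" has to be written as the two-character escape \n, or removed.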
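
One way to avoid escaping by hand is to let a JSON serializer produce the document, since it turns each line break into \n and each inner quote into \" in a single step. The following is only a sketch under stated assumptions: Python, the helper names build_payload and send_udp, and the shortened XML are hypothetical and chosen for illustration; the address 127.0.0.1:5041 and the host value are taken from the posts above. It does not check whether the gelf input expects any further fields.

```python
import json
import socket

# Multi-line XML as in the question (shortened here for illustration).
xml_text = """<?xml version="1.0" encoding="utf-8"?>
<ImportMessageBase>
  <Version>1.0.0</Version>
  <Topic>test</Topic>
</ImportMessageBase>"""


def build_payload(short_message: str, host: str) -> bytes:
    # Serialize to single-line JSON: the serializer escapes the
    # line breaks and double quotes inside short_message.
    document = {"short_message": short_message, "host": host}
    return json.dumps(document).encode("utf-8")


def send_udp(payload: bytes, address=("127.0.0.1", 5041)) -> None:
    # Send the whole document as one UDP datagram.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


payload = build_payload(xml_text, "mock-service-6596db7999-m8r7c")

# The serialized form contains no raw newline, so a JSON parser no
# longer sees an illegal control character inside the string.
print(payload.decode("utf-8"))
```

Calling send_udp(payload) would then send the document in one piece. Doing the same by hand means putting the whole document on a single line and writing \n (or nothing) wherever the XML has a line break.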

Hello @Badger,

Thank you. Just so I understand: what I want to send is the following message:

echo '{ "short_message": "<?xml version=\"1.0\" encoding=\"utf-8\"?>
      <ImportMessageBase xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">
      <Version>1.0.0</Version>
      <Direction>Inbound</Direction>
      <Topic>test</Topic>
      <ConversationId>{EA73CD72-96DC-EC2C-E053-3EA0010A1319}</ConversationId>
      <Entity>AVIS_LAGOS2BSM</Entity>
      <Company>07SC</Company>
      <Data>PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxBVklTX0xBR09TMkJTTT4KICA8QVZfQVZJ=</Data>",
          "host": "mock-service-6596db7999-m8r7c"} ' > /dev/udp/127.0.0.1/5041

How should I transform it so that I no longer get the previous error?

Best regards,
Alexandros