Which input (JSON or XML) is better for parsing in Logstash in terms of scalability, performance and nested structure?

Hi everyone,

I need to decide which input format to parse in Logstash, considering performance, scalability and support for nested structures.
The input file could contain nearly 100k records.
The data source is Informatica.

Please let me know if you need any further information.

Hi,

If you are able to use JSON, then do that. XML has several drawbacks:

  • documents are larger than the equivalent JSON
  • no native array type (repeated elements have to stand in for arrays)
  • vulnerabilities such as XML external entity (XXE) attacks
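
To illustrate the array point, here is a minimal Python sketch (not Logstash code). The helper `element_to_dict` and the `<item>` tag are made up for the example. A naive XML-to-dict conversion can only guess that something is a list when a tag repeats, so a single child comes out as a scalar, while JSON states the type explicitly.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Naively convert an element: text for leaves, dict for parents."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is the only hint that a list was intended.
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# One child: the converter cannot tell that "item" was meant to be a list.
one = element_to_dict(ET.fromstring("<main_field><item>a</item></main_field>"))

# Two children: the same field now becomes a list.
two = element_to_dict(
    ET.fromstring("<main_field><item>a</item><item>b</item></main_field>")
)

# JSON keeps the list type even with a single entry.
from_json = json.loads('{"item": ["a"]}')
```

A field that changes type between documents tends to cause mapping trouble downstream. As far as I know, the Logstash xml filter has a `force_array` option that deals with this, but check the plugin documentation for the exact behaviour.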

I also think that parsing JSON is faster than XML, but that might depend on your data. If you want to be sure, test it yourself.
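
A rough way to run such a test outside Logstash is sketched below in Python. It builds the same synthetic records (field1 to field8, with invented values) as JSON and as XML, then compares document size and parse time. The function names are made up for the example. The timings only describe Python's parsers, not Logstash's, so treat them as a method to copy rather than as a result. A real comparison should run both formats through your actual pipeline.

```python
import json
import timeit
import xml.etree.ElementTree as ET


def make_records(count):
    """Build synthetic records with field1..field8 (values are invented)."""
    return [
        {f"field{i}": f"value{n}_{i}" for i in range(1, 9)}
        for n in range(count)
    ]


def to_json(records):
    """Serialize the records as one JSON document."""
    return json.dumps({"main_field": {"main_sub_field": records}})


def to_xml(records):
    """Serialize the same records as one XML document."""
    root = ET.Element("metadata")
    main = ET.SubElement(root, "main_field")
    for record in records:
        sub = ET.SubElement(main, "main_sub_field")
        for name, value in record.items():
            ET.SubElement(sub, name).text = value
    return ET.tostring(root, encoding="unicode")


def compare(count=10_000, repeats=3):
    """Return byte sizes and best-of-N parse times for both formats."""
    records = make_records(count)
    json_text, xml_text = to_json(records), to_xml(records)
    return {
        "json_bytes": len(json_text.encode("utf-8")),
        "xml_bytes": len(xml_text.encode("utf-8")),
        "json_seconds": min(
            timeit.repeat(lambda: json.loads(json_text), number=1, repeat=repeats)
        ),
        "xml_seconds": min(
            timeit.repeat(lambda: ET.fromstring(xml_text), number=1, repeat=repeats)
        ),
    }


if __name__ == "__main__":
    print(compare())
```

Raising `count` to 100_000 brings the test closer to the file size mentioned in the question.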

Best regards
Wolfram

OK, thanks @Wolfram_Haussig :)

My initial data looks like this:

<metadata .....>
    <total_doc_processed></total_doc_processed>
    <main_field>
        <main_sub_field>
            <field1></field1>
            <field2></field2>
            <field3></field3>
            <field4></field4>
            <field5></field5>
            <field6></field6>
            <field7></field7>
            <field8></field8>
        </main_sub_field>
        <main_sub_field>
            <field1></field1>
            <field2></field2>
            <field3></field3>
            <field4></field4>
            <field5></field5>
            <field6></field6>
            <field7></field7>
            <field8></field8>
        </main_sub_field>
        ....
        ....
    </main_field>
</metadata>

Each main_sub_field will become one record (document) in Elasticsearch.
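
The one-record-per-main_sub_field idea can be sketched as a pre-processing step in Python. This is an illustration only. The sample values and the `iter_records` helper are invented, and inside Logstash the xml and split filters are the usual tools for the same job. The sketch streams the file with `iterparse`, so the whole document never has to sit in memory, and emits one JSON line per record.

```python
import io
import json
import xml.etree.ElementTree as ET


def iter_records(xml_source, record_tag="main_sub_field"):
    """Yield one dict per <main_sub_field> while streaming the input."""
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            yield {child.tag: child.text for child in elem}
            elem.clear()  # drop the children of the record just emitted


# Invented sample data following the layout shown above.
SAMPLE = b"""<metadata>
  <total_doc_processed>2</total_doc_processed>
  <main_field>
    <main_sub_field><field1>a</field1><field2>b</field2></main_sub_field>
    <main_sub_field><field1>c</field1><field2>d</field2></main_sub_field>
  </main_field>
</metadata>"""

# One JSON line per record (newline-delimited JSON).
ndjson_lines = [json.dumps(record) for record in iter_records(io.BytesIO(SAMPLE))]
```

Each entry in `ndjson_lines` is a self-contained JSON object, so a file written this way can be read one line per event.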
