Split an array of arrays

Hi all,
I'm currently working on parsing a really nasty XML stream: events arrive in single-file form, can run up to 5000 lines and, most importantly, include arrays of arrays that hold the data I need to parse and visualize.

After some tinkering I came to the solution of using multiple split filters: one for the main array and others for the sub-arrays. This solution, however, leaves me out in the cold if a sub-array includes fields that I have not accounted for (a case I fear will come up sooner or later).
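For reference, a minimal sketch of the multiple-split approach described above. The field names `[doc][outer]` and `[doc][outer][inner]` are hypothetical placeholders, not taken from the real stream. It relies on the split filter replacing the field with a single element on each new event, so the second split can reach the inner array.

```
filter {
  # First pass: emit one event per element of the outer array.
  # "[doc][outer]" is a made-up field name for illustration.
  split {
    field => "[doc][outer]"
  }

  # Second pass: each event now carries a single outer element,
  # so its inner array (if present) can be split in turn.
  if [doc][outer][inner] {
    split {
      field => "[doc][outer][inner]"
    }
  }
}
```

Each additional nesting level would need its own split block, which is exactly the limitation mentioned above.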

Is there any way to make the Split filter recursive so that it could manage any kind of subarrays?
I understand this would be very load intensive, but given the scope of the project, this is something already expected.


Hi Opellulo,

Have you taken a look at the XML filter?

filter {
  xml {
    source => "message"
    target => "doc"  # assumed field name; target is needed while store_xml keeps its default
  }
}

Hope this can help!!

No, it will only split the field you tell it to split.

Thanks for the replies,
I suppose the "correct" way to do this is through XPath (which, to be honest, I don't want to dive into unless someone can assure me beforehand that it can handle nested arrays), so for the moment I will stick with the multiple splits. It may not be elegant, but it works.
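For anyone curious about the XPath route, here is a rough sketch using the xml filter's `xpath` option, which maps an XPath expression to a destination field. The element path and the `inner_values` field name are hypothetical, and whether this fits a given nested layout would need testing against the real documents.

```
filter {
  xml {
    source => "message"
    # Skip the full XML-to-hash conversion and extract only what the
    # expressions below select.
    store_xml => false
    xpath => {
      # Hypothetical path: text of every <inner> under every <outer>.
      "/root/outer/inner/text()" => "inner_values"
    }
  }
}
```

Note that an expression like this collects matches from all parents into one field, so any grouping by outer element would have to be recovered separately.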

After some testing it seems I found the error: the XML has an array node with multiple attributes on that node and its children, so with the default xml filter configuration Logstash was mixing up the values, giving me a lot of headaches.

The situation improved a lot after changing these xml filter parameters:

force_array => false
force_content => true

All while keeping the split for the array node.
Now I only get a cosmetic split error if the array node has only one value (it becomes a hash, so it's not splittable), but I can live with it by removing the tag.
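Putting the pieces together, a minimal sketch of the resulting pipeline, with one possible way to avoid the single-value case. The `doc` target and the `[doc][item]` field are hypothetical names, and the ruby filter is an assumption on my part rather than something from the setup described here: it wraps a lone hash in an array so the split always sees an array.

```
filter {
  xml {
    source => "message"
    target => "doc"          # hypothetical target field
    force_array => false     # do not wrap every element in an array
    force_content => true    # parse text content and attributes to a hash consistently
  }

  # With force_array => false a single child arrives as a hash instead
  # of a one-element array, so wrap it before splitting.
  ruby {
    code => '
      v = event.get("[doc][item]")
      event.set("[doc][item]", [v]) if v.is_a?(Hash)
    '
  }

  split {
    field => "[doc][item]"
  }
}
```

The alternative, as noted above, is to leave the pipeline as it is and simply drop the failure tag from the affected events.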
