Extracting a tag from an XML file using RSS input filter

In Logstash, the rss input is not picking up the 'enclosure' or 'subtitle' elements shown below. How do I force those elements to be included in the event?

<itunes:category text="History"/>

Day
https://year/day
Sat, 13 Nov 2021
Formation

<itunes:author>Formation</itunes:author>
<itunes:subtitle>subtitle</itunes:subtitle>
<itunes:duration>26:00</itunes:duration>

I don't think you can. The rss input just processes a fixed set of required elements.

Can optional elements be added to the RSS filter or any other filter?
Is there a way to find out about the rss filter and what elements it processes?
How about using Grok or Dissect?

You could follow the link that I put in my answer, which goes to the code in the rss input that extracts elements.

Conceptually the input just does an HTTP GET, which you could do with an http_poller input, and parses the XML result, which you could do with xml and split filters. Or you could modify the code of the input and build your own version.
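
A minimal sketch of the http_poller + xml + split approach might look like the following. It is untested against your feed and rests on several assumptions: the feed URL and the 10 minute schedule are placeholders, the target field name "feed" is arbitrary, and the feed is assumed to contain more than one item (with force_array => false a single item is parsed as a hash, which the split filter cannot split).

```
input {
  http_poller {
    # Placeholder URL (assumption): replace with the real feed address.
    urls => { feed => "https://example.com/feed.xml" }
    # Placeholder polling interval (assumption).
    schedule => { every => "10m" }
    # The default codec is json; plain keeps the raw XML body in [message].
    codec => "plain"
  }
}

filter {
  # Parse the whole feed into a nested structure under [feed].
  xml {
    source => "message"
    target => "feed"
    force_array => false
  }
  # Emit one event per <item> in the channel.
  split {
    field => "[feed][channel][item]"
  }
}

output {
  # Print full events so you can see which field names the parser produced.
  stdout { codec => rubydebug }
}
```

Because the xml filter stores every element it parses, optional elements such as enclosure and the itunes ones should show up somewhere under [feed][channel][item]. I have not checked the exact field names for namespaced elements, so look at the rubydebug output before writing any mutate or rename logic against them.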

Thanks for the help and tips. To modify the rss input, can I just find the existing rss.rb file on my system, edit it, and use it to test my changes?

I hope someone else will answer this, but my understanding is that the logstash package includes Ruby "gem" packages, which in turn include both source and architecture-specific binaries.

If that is correct then modifying the source will not modify the binary that logstash loads.

The README on GitHub for the rss input is the boilerplate documentation for how to build your plugin (which I think assumes you forked the repo).

I found and modified the rss input source file, rss.rb, and the changes I made took effect. I don't know whether the source got automatically compiled or not, but it allowed me to grab a few more XML fields that I needed.
Thanks for the help!
