Terizian
(Terizian)
June 28, 2020, 9:56am
1
I was trying the RSS plugin with Logstash, and I'm facing an error as detailed below:
Version:
Logstash 7.7.1
Operating System:
CentOS 7.8.2003
Config File:
input {
  rss {
    url => "https://www.alittihad.ae/arabi.rss"
    interval => 7200
    tags => ["ar", "rss", "alittihad"]
  }
}
filter {
  fingerprint {
    source => "title"
    target => "[@metadata][fingerprint]"
    method => "MURMUR3"
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    workers => 1
    document_id => "%{[@metadata][fingerprint]}"
  }
  stdout {}
}
Error:
[ERROR][logstash.inputs.rss ][main][28d92d7fa1e60631ff0741642cbb39e90c8c542b6251b01324014030be140f75] Uknown error while parsing the feed {:url=>"https://www.alittihad.ae/arabi.rss", :exception=>#<RSS::MissingAttributeError: attribute <url> is missing in tag <source>>}
Badger
June 28, 2020, 3:56pm
2
If you open that URL in a browser, you will see source elements such as
<source>
<![CDATA[ ]]>
</source>
The RSS spec requires that a source element have a url attribute, so this is not valid RSS.
Terizian
(Terizian)
June 28, 2020, 4:53pm
3
Thank you for your response! I am aware that the RSS feed is invalid. However, I tried another RSS document (https://www.emaratalyoum.com/1.533091?ot=ot.AjaxPageLayout) that didn't have a <source> tag, and it worked fine.
Any chance errors can be bypassed in the plugin?
Badger
June 28, 2020, 5:08pm
4
The input does not have a way to suppress the error.
Terizian
(Terizian)
June 28, 2020, 5:20pm
5
What do you think could be a better way of dealing with this? I'd really like to use Logstash in my pipeline.
Badger
June 28, 2020, 5:30pm
6
You might be able to use mutate+gsub to remove the empty source elements.
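A rough sketch of what that could look like, with several assumptions. The rss input parses the feed itself before any filter runs, so a gsub can only help if the raw feed text reaches the filter stage some other way. The sketch below assumes the feed is fetched with the http_poller input and a plain codec, so the XML lands in the message field. The key name alittihad, the two-hour schedule (mirroring interval => 7200), and the regular expression are illustrative choices, not something tested against this feed.

```
input {
  http_poller {
    # Assumption: fetch the raw feed text instead of using the rss input
    urls => {
      alittihad => "https://www.alittihad.ae/arabi.rss"
    }
    schedule => { every => "2h" }
    codec => "plain"
    tags => ["ar", "rss", "alittihad"]
  }
}
filter {
  mutate {
    # Remove <source> elements that contain only an empty CDATA section.
    # Assumes default escape handling, so the backslashes reach the regex as written.
    gsub => [
      "message", "<source>\s*<!\[CDATA\[\s*\]\]>\s*</source>", ""
    ]
  }
}
```

The cleaned XML would still need to be broken into one event per item, for example with the xml and split filters, before the fingerprint filter can work on a title field. That part is not shown here.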
Terizian
(Terizian)
June 28, 2020, 6:02pm
7
Thanks for the advice. I'll see what I can do about it.