How to remove %3A and similar percent-encoded sequences from text while parsing?

Hello, I have a field in my logs:
"name+Of+the%3A%C2%AE+field"
which sometimes can be like this:
"%D9+%81+%D8+%B1+%D8+%A7+%D8"
How can I remove these extra sequences in Logstash filters? The desired output for the first example is "name Of the field", and for the second one it should be an empty string.
Thanks in advance!

That looks like you have the wrong text encoding (the values are percent-encoded). If you do not want to fix that and just want to strip the encoded sequences, you could try:

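mutate {
    # Applied in order: drop %XX escapes, turn "+" into a space, collapse whitespace.
    # "+" is written as "[+]" because a bare "+" is a regex quantifier, not a literal.
    gsub => [
        "message", "%[A-F0-9]{2}", "",
        "message", "[+]", " ",
        "message", "\s+", " "
    ]
}
# A value made up only of escapes and "+" is now a single space; make it empty.
if [message] == " " {
    mutate { replace => { "message" => "" } }
}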
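
If you would rather fix the encoding than strip it, a minimal sketch using the urldecode filter could look like the following. It assumes the urldecode filter plugin is available in your Logstash install, that the field is "message", and that the original text was UTF-8; none of that is confirmed by your logs.

filter {
    mutate {
        # In form-style encoding "+" stands for a space. Convert it before decoding
        # so that a decoded "%2B" (a literal plus) is left alone.
        gsub => [ "message", "[+]", " " ]
    }
    urldecode {
        field => "message"    # field to decode (assumed to be "message")
        charset => "UTF-8"    # assumed source encoding
    }
}

With this approach the first sample should keep its decoded characters (%3A is ":" and %C2%AE is "®") instead of dropping them, so it is not a drop-in replacement if you really want "name Of the field". The second sample has a "+" between individual bytes, so it would not decode to valid UTF-8 as it stands.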

Thank you so much. For anyone getting the + operator error: a bare "+" is a regex quantifier, so use "[+]" instead of "+".

Hi again. After adding this, I ran into the following error:

Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"2373695,
...
"_type"=>"_doc", "_id"=>"2373695326", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [timestamp] cannot be changed from type [text] to [date]"}}}}

I have a timestamp field in addition to @timestamp, and it has type text. Only one of my logs in this index does not get parsed. When I restart Filebeat multiple times, it fetches the log again, but sometimes it works and other times it does not. What might be causing this? Thanks!

A field in Elasticsearch must have the same type on every document in an index. Once it is set to text it cannot be changed to date until it rolls over to a new index. The type is set by dynamic mapping unless you have an index template.
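
As a rough sketch of the index template route, the elasticsearch output can be pointed at a template file so that the mapping for timestamp is fixed up front rather than guessed by dynamic mapping. The host, index name, template name and file path below are placeholders I made up, and the JSON template file itself (which would map timestamp explicitly, for example as text) is not shown.

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]                    # placeholder host
        index => "my-logs-%{+YYYY.MM.dd}"                     # hypothetical index pattern
        manage_template => true                               # let Logstash install the template
        template => "/etc/logstash/templates/my-logs.json"    # hypothetical path to the template file
        template_name => "my-logs"                            # hypothetical template name
        template_overwrite => true                            # replace an older template with the same name
    }
}

A template only applies to indices created after it is installed, so an existing index keeps whatever mapping it already has.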

Thanks, but my timestamp field is set to text in all filters, and the default @timestamp is date.
