Logstash CSV Filter Using Unicode delimiter ( SOH )

Hi guys,
I am trying to parse a CSV file whose delimiter is SOH, i.e. hex value \x01 (Unicode value \u0001), but the Logstash csv filter seems to ignore the separator I specify.
Example:

    filter {
        csv {
            separator => "\u0001"
        }
    }

Does the Logstash csv filter support Unicode characters or other special characters as a delimiter?

Thanks,
sam

The string type documentation doesn't mention anything about this, so one should assume escape sequences are unsupported. See related thread below:

Hi Magnus,
Is this a feature missing from the Logstash csv filter, or a fundamental limitation, in that Logstash cannot support this because it doesn't allow Unicode characters in its string data type?

Thanks,
sam

Logstash strings do not support escape sequences as a way to represent non-printable characters. If you can't put a literal SOH character (U+0001) in the file (which the topic I linked to indicated didn't work), you're out of luck. This has nothing to do with the csv filter.

I worked around this issue: I used a mutate filter, as given below, to replace the SOH delimiter with a comma, and then parsed the result using csv.

    mutate{
        gsub => [ "message","\u0001","," ]
    }
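
For anyone trying this, a fuller sketch of the workaround might look like the following. It assumes the events arrive in the default message field and that no field value contains a comma of its own; if yours can, pick a replacement character that never appears in the data and use it in both places.

    filter {
        # Swap the SOH delimiter for a comma (gsub takes a regular expression as its pattern)
        mutate {
            gsub => [ "message", "\u0001", "," ]
        }
        # Parse the rewritten line; "," is also the csv filter's default separator
        csv {
            separator => ","
        }
    }

The mutate block has to come before csv, since filters run in the order they appear in the config.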