S3 CSV Codec - Using a non-printable character as separator

Is it possible to use a non-printable character as the separator in the csv codec?

codec => csv {
  columns => ["col1", "col2", "col3"]
  charset => "UTF-8"
  separator => "\\u001F"
}

I have tried the following:
\u001f, \u001F, \\u001f, \\u001F, \u{001F}

Any ideas?

If this is not possible, how would I have to modify the code to make it possible?

It's not the codec that you would modify to accept those \u escape sequences; it is the compiler for the Logstash configuration language.

However, there is no need to do that; you can just use the literal character. When logged into a UNIX host from a Windows environment, I would type Ctrl+V followed by Alt+031 to generate it. od shows that as a 'us' (unit separator) character.
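
If typing the control character by hand is awkward, one option is to generate the config fragment from a script so the literal byte ends up in the file. This is only a rough sketch, not something from the plugin itself: the function name render_codec is made up for illustration, and the column names are just the ones from the question.

```python
# Sketch: build a csv codec block that contains the literal unit separator
# (U+001F, decimal 31) instead of an escape sequence.
SEPARATOR = "\x1f"  # Python decodes this escape; the result is one raw character


def render_codec(columns, separator=SEPARATOR):
    """Return a csv codec block with the separator embedded as a raw character.

    render_codec is a hypothetical helper, not part of Logstash.
    """
    quoted = ", ".join('"{}"'.format(name) for name in columns)
    return (
        "codec => csv {\n"
        "  columns => [" + quoted + "]\n"
        '  charset => "UTF-8"\n'
        '  separator => "' + separator + '"\n'
        "}\n"
    )


snippet = render_codec(["col1", "col2", "col3"])

# repr() makes the otherwise invisible character visible for a quick check.
print(repr(snippet))
```

The printed repr should show \x1f between the quotes on the separator line. Writing snippet to a file with UTF-8 encoding stores that character as the single byte 0x1f.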

Yeah, I know that I can copy and paste it, but I wanted to avoid that option and stick with the \u or \x representation.

I have tried, but my YAML is not happy with that character...

An interesting related post is here. In that case \u00xx works inside a string, but not because Logstash decodes it. I would guess something in Manticore decodes it, or else something on the receiving end in Elasticsearch. It will not work in this case.

In the end, I forked the plugin to use that specific character.
