Logstash - Elasticsearch and double quotes

I'm trying to ingest logs which have " in some of their fields as part of a string. I've tried to escape them with \ or \\ but no luck. Is there any way to ingest double quotes from logs into Elasticsearch using Logstash?


Why do you feel the need to escape them?

Because I'm seeing errors in the output when Logstash encounters a field like: charCHar"charCHAr

What errors are you seeing?

For example:

field=>"message", :source=>"string,string"19\r", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}

The CSV format requires that if a field is quoted (presumably because it may contain a comma) then the entire field must be quoted. That is, the first and last characters must be quotes. You cannot have additional characters such as a carriage return at the end of the field.

If the problem is just a \r at the end of the line you can remove it using mutate+gsub.
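A minimal, untested sketch of that mutate+gsub step, assuming the raw line is in the "message" field and that the carriage return is the last character of the line. The csv options are left elided because they depend on your file:

```
filter {
  # Strip a trailing carriage return before the csv filter sees the line.
  # The second element is a regular expression, so \r matches the CR character.
  mutate { gsub => [ "message", "\r$", "" ] }
  csv { ... }
}
```

Note that this only removes the \r. It does not help if a field contains a stray double quote.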

I'll try, but the problem, I guess, is that the field is not quoted in the CSV file. A field may contain zero or more double quotes, like this:

fieldstringA, field"str""ingB\r
fieldstringA, fieldstringB\r
fieldstringA, fieldst"ringB\r

If the fields never contain commas then another option is to replace the quotes with something else, run the csv filter, then gsub them back. I have not tested it, but something like

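# Swap every double quote for a placeholder that should not occur in the data.
mutate { gsub => [ "message", '"', ":#$%^^&" ] }
csv { ... }
ruby {
    code => '
        # Put the original double quotes back in every string field.
        event.to_hash.each { |k, v|
            if v.is_a? String
                event.set(k, v.gsub(":#$%^^&", "\""))
            end
        }
    '
}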

So it is not possible to ingest double quotes? That's my problem: double quotes are part of some fields, and I need to ingest them :frowning:

It is certainly possible to ingest double quotes, but if you want to use a csv filter then the message has to be a properly formatted CSV line. And "properly formatted" constrains the ways in which double quotes can appear.

In fact, the file I am trying to import is simply a list of text strings: each line has two values separated by a constant delimiter, with no headers or other complexity. Would it perhaps be better to use another import format, such as JSON?

You might be better off replacing the csv filter with dissect if I understood that correctly.
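A rough, untested sketch of what that could look like, assuming two values per line separated by a comma and a space as in the sample lines above. The field names fieldA and fieldB are placeholders, not something from your data:

```
filter {
  # Remove the trailing carriage return so it does not end up in the last field.
  mutate { gsub => [ "message", "\r$", "" ] }
  dissect {
    # Split on the literal ", " delimiter; everything after it goes into fieldB.
    mapping => { "message" => "%{fieldA}, %{fieldB}" }
  }
}
```

Dissect only looks for the literal delimiters in the mapping, so double quotes inside a value are kept as ordinary characters.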
