CSV filter plugin not working

Is this a Kibana problem or a logstash problem?

Can you explain what you are trying to do with this config?

This config will use dissect to parse your message, then it will create a field named a with the value x. Your add_field makes no sense: you do not have a field named ia, so it will add the literal value of %{ia} into the field a.

Anything that you use in the sprintf format, %{something}, means that you want to use the value stored in that field.

For example, if you have a field named level in your document with a value of INFO, then using add_field => { "test" => "%{level}" } will create a field named test with the value of the field level, in this case INFO. If the field level does not exist, then the field test will receive the literal value %{level}.
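A minimal sketch of that behavior (the level and test field names are just examples):

filter {
    mutate {
        # If the event has "level" => "INFO", this creates "test" => "INFO".
        # If "level" does not exist, "test" gets the literal string "%{level}".
        add_field => { "test" => "%{level}" }
    }
}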

From what I could see in your Kibana screenshot, you are trying to add fields that do not exist in your document.

Thanks for the reply.

Can you explain what you are trying to do with this config?

I have a log with the following format.

timestamp \t [thread-name] \t log-level \t class-name - log-message

When I collect logs in logstash, all these logs are kept in a field named message.
I would like to keep each part of the format in a separate field.

There are two difficulties.

One is that the delimiter of the format is a tab, and logstash does not recognize tabs well.

The other is that the %{...} variables that should have been replaced with the split values are displayed as-is.

you do not have a field named ia, so it will add the literal value of %{ia} into the field a.

To your point, I don't think I've completed the first task.

Could you please tell me how to correctly split the above format?

This config will use dissect to parse your message, then it will create a field named a with the value x. Your add_field makes no sense.

Furthermore, from this point of view, it seems that add_field is not necessary either.

grok is a conundrum for me.
dissect was taught to me by Badger.

Please let me know which plugins can successfully decipher the above formats and how to use them.

This is no problem for dissect; you just need to use a literal tab in your mapping pattern.

Using your example, if you have a message like this one, where the fields are delimited by tabs:

2021-04-09 21:30:00\t[someThread]\tINFO\tsome.Class.Name - sample message

You would need a dissect filter like this one:

dissect {
    # Press the Tab key where TAB appears below; the mapping must contain a real tab character.
    mapping => { "message" => "%{timestamp}TAB[%{thread-name}]TAB%{log-level}TAB%{class-name} - %{msg}" }
}

Where you have TAB in the above configuration, you need to press the Tab key to insert a literal tab character. You also need to make sure that the editor you are using is really inserting a tab and not swapping the tab character for a number of spaces.

If it is correct, you will have an output like this:

{
      "log-level" => "INFO",
       "@version" => "1",
     "@timestamp" => 2021-04-09T04:00:52.945Z,
     "class-name" => "some.Class.Name",
    "thread-name" => "someThread",
        "message" => "2021-04-09 21:30:00\t[someThread]\tINFO\tsome.Class.Name - sample message",
           "host" => "elk",
            "msg" => "sample message",
      "timestamp" => "2021-04-09 21:30:00"
}

You could also use a mutate filter to change the delimiter in your message from tab to another character, like a pipe, |.

filter {
    mutate {
        # Replace every tab in the message with a pipe character.
        gsub => ["message", "\t", "|"]
    }
    dissect {
        # The delimiters are now pipes, which can be typed directly in any editor.
        mapping => { "message" => "%{timestamp}|[%{thread-name}]|%{log-level}|%{class-name} - %{msg}" }
    }
}

This would change the tabs in the message field to pipes |, and the following dissect will parse your message.

I'm sorry, I do not know what you mean by this.


Thanks for the advice.
Your tab-to-pipe change idea worked well.
I can't thank you enough!!!

Where you have TAB in the above configuration, you need to press the Tab key to insert a literal tab character. You also need to make sure that the editor you are using is really inserting a tab and not swapping the tab character for a number of spaces.

This time I failed because of the TAB problem you are talking about.

I also tried the "literal tab character" but it didn't work.

I'm not sure what the cause is, but is it possible that the "literal tab character" of the server that is outputting the logs is different from the "literal tab character" of the server that logstash is running on?

If you can't reproduce the same "literal tab character" due to editor settings or some other issue, I think it would be a tricky problem.

A tab character is always a tab; it has the same ASCII code (0x09) everywhere. The problem could be in the editor used to write the pipeline configuration.

Some editors automatically replace the tab character with 4 or 8 spaces. Visually it can look the same, but it is something different, which really makes it a tricky problem.
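If you suspect this is what is happening, one way to spot it is to check for dissect's default _dissectfailure tag, which it adds to events whose mapping did not match. A minimal sketch (TAB again marks a literal tab as above, and the parse_status field is just an illustration):

filter {
    dissect {
        mapping => { "message" => "%{timestamp}TAB[%{thread-name}]TAB%{log-level}TAB%{class-name} - %{msg}" }
    }
    # dissect tags the event with _dissectfailure when the mapping does not
    # match, e.g. because the pattern contains spaces instead of a real tab.
    if "_dissectfailure" in [tags] {
        mutate { add_field => { "parse_status" => "failed" } }
    }
}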

Thank you for your comment.

Is it a problem with the editor?
That's very difficult.

It would be nice if dissect could use \t like mutate.
