Repeated delimiters with dissect: how to handle?

Hi,

Is there a way to change the default behaviour of the dissect filter, which treats repeated delimiters as a single one?

It causes problems with empty fields between identical delimiters, as in this comma-delimited string:

Aaa,bbb,,ccc,,,,ddd

Dissect would match it as just four values, but I would like to get explicit empty values instead. I do not know beforehand which fields will end up empty. Is that possible without going back to the CSV filter?
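To make the difference concrete, here is a minimal Python sketch of the two behaviours on the sample string. It only models the splitting logic with the standard library; it is not the dissect plugin's code, and the variable names are made up for illustration.

```python
import re

line = "Aaa,bbb,,ccc,,,,ddd"

# Behaviour described above: a run of commas counts as one delimiter,
# so the empty fields disappear.
collapsed = re.split(r",+", line)

# Behaviour asked for: every comma is a delimiter,
# so empty fields are kept as empty strings.
explicit = line.split(",")

print(collapsed)  # ['Aaa', 'bbb', 'ccc', 'ddd']
print(explicit)   # ['Aaa', 'bbb', '', 'ccc', '', '', '', 'ddd']
```

With explicit empty values, the position of each field stays stable no matter which ones are empty.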

One cannot yet change this behaviour.

Dissect treats a run of repeated delimiters as a single delimiter in order to handle space padding.

When certain sections are padded with spaces depending on the number of characters in the section, a person creating a dissection cannot know how many spaces to use as the delimiter.

2017-06-28 12:12:12                  SHORT: some message
2017-06-28 12:12:13 SERIOUSLYEXTREMELYLONG: some message
2017-06-28 12:12:14           VERYVERYLONG: some other message

The spaces after the date must be seen as one delimiter.
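As a rough illustration of why the padding has to collapse, here is a small Python sketch that parses the three lines above by treating any run of spaces as one delimiter. The function name and the field names (date, time, app, message) are assumptions made for the example; this is not how the plugin is implemented.

```python
import re

lines = [
    "2017-06-28 12:12:12                  SHORT: some message",
    "2017-06-28 12:12:13 SERIOUSLYEXTREMELYLONG: some message",
    "2017-06-28 12:12:14           VERYVERYLONG: some other message",
]

def parse_padded(line):
    # A run of one or more spaces counts as a single delimiter,
    # so the amount of padding before the app name does not matter.
    date, time, rest = re.split(r" +", line, maxsplit=2)
    app, message = rest.split(": ", 1)
    return {"date": date, "time": time, "app": app, "message": message}

parsed = [parse_padded(line) for line in lines]
for event in parsed:
    print(event["app"], "|", event["message"])

# Splitting on exactly one space instead leaves empty strings
# wherever the padding was.
naive = lines[0].split(" ")
```

The same pattern works for all three lines, whereas the single-space split produces a different number of pieces for each one.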

There is an enhancement request about this. https://github.com/logstash-plugins/logstash-filter-dissect/issues/11
If you wish, you could add your +1 to it.

Hi Guy,

Thanks for the reply! Yes, joining spaces together is a large use case, but it would be great if dissect had an option to switch this behaviour on and off.

I was looking at the code some minutes ago - it does not seem difficult. I will not be able to do an update for a few days though.

@Grigory_Shamov1

I have been thinking. How about if I added a suffix to indicate that the delimiter following this field should be greedy? I'm thinking of ->, meaning that users with space-padded text have to opt in. It also means not having to commit the whole dissection to one behaviour.
Example:
Data:

2017-06-28 12:12:12                  SHORT: f1,,f3,,,f6

Mapping:

%{date/1} %{+date->} %{APP}: %{csv1},%{csv2},%{csv3},%{csv4},%{csv5},%{csv6}
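A minimal Python sketch of the proposed semantics, applied to the data line above. This is only a model of the idea, not plugin code: the `dissect_sketch` function and its (name, delimiter, greedy) tuples are hypothetical, and for simplicity the two date parts are stored as separate `date` and `time` fields instead of being appended into one.

```python
def dissect_sketch(line, fields):
    """fields: list of (name, delimiter, greedy).

    A delimiter of None means "take the rest of the line".
    greedy=True models the proposed -> suffix: repeats of the
    delimiter that follows the field are skipped.
    """
    result = {}
    pos = 0
    for name, delim, greedy in fields:
        if delim is None:
            result[name] = line[pos:]
            break
        end = line.index(delim, pos)  # raises ValueError if not found
        result[name] = line[pos:end]
        pos = end + len(delim)
        if greedy:
            # Opt-in: swallow any further copies of the delimiter.
            while line.startswith(delim, pos):
                pos += len(delim)
    return result

line = "2017-06-28 12:12:12                  SHORT: f1,,f3,,,f6"
fields = [
    ("date", " ", False),
    ("time", " ", True),    # greedy: padding before the app name
    ("APP", ": ", False),
    ("csv1", ",", False),   # not greedy: empty fields are kept
    ("csv2", ",", False),
    ("csv3", ",", False),
    ("csv4", ",", False),
    ("csv5", ",", False),
    ("csv6", None, False),
]

event = dissect_sketch(line, fields)
print(event)
```

Under these assumptions only the padded field collapses its delimiter, while csv2, csv4 and csv5 come out as empty strings.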

What do you think?
