Repeated delimiters with dissect, how to handle?

Grigory_Shamov1 · June 27, 2017, 12:00am

Hi,

Is there a way to change the default behaviour of the dissect filter, which treats repeated delimiters as a single one?

It causes problems with empty fields between identical delimiters, as in this comma-delimited string:

Aaa,bbb,,ccc,,,,ddd

Dissect would fill only the first four fields from it, but I would like to have explicit empty values instead. I do not know beforehand which fields will end up empty. Is that possible without going back to the CSV filter?
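
To make the difference concrete, here is a rough sketch in plain Python (not Logstash, and not how dissect is implemented) of the two behaviours on the string above. The variable names are only for illustration.

```python
import re

line = "Aaa,bbb,,ccc,,,,ddd"

# Runs of commas treated as one delimiter: the empty fields vanish.
collapsed = re.split(r",+", line)

# Every comma is its own delimiter: empty fields are kept as "".
explicit = line.split(",")

print(collapsed)  # 4 values
print(explicit)   # 8 values, 4 of them empty strings
```

The second result is what I am after: eight positions, with the empty ones still present.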

guyboertje · June 28, 2017, 1:54pm

One cannot yet change this behaviour.

Dissect consumes multiple delimiters because of space padding.

When certain sections are padded with spaces depending on the number of characters in the section, a person creating a dissection will not know how many spaces to use as the delimiter.

2017-06-28 12:12:12                  SHORT: some message
2017-06-28 12:12:13 SERIOUSLYEXTREMELYLONG: some message
2017-06-28 12:12:14           VERYVERYLONG: some other message

The spaces after the date must be seen as one delimiter.
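
As a rough illustration only, using Python regular expressions as a stand-in for dissect (this is not the plugin's code), the sketch below compares a pattern that accepts any run of spaces after the time with one that insists on exactly one space. The pattern names `greedy` and `strict` are made up for this example.

```python
import re

lines = [
    "2017-06-28 12:12:12                  SHORT: some message",
    "2017-06-28 12:12:13 SERIOUSLYEXTREMELYLONG: some message",
    "2017-06-28 12:12:14           VERYVERYLONG: some other message",
]

# One or more spaces after the time count as a single delimiter.
greedy = re.compile(r"^(\S+) (\S+) +(\S+): (.*)$")
# Exactly one space after the time: only the unpadded line can match.
strict = re.compile(r"^(\S+) (\S+) (\S+): (.*)$")

greedy_apps = [greedy.match(line).group(3) for line in lines]
strict_hits = [strict.match(line) is not None for line in lines]

print(greedy_apps)
print(strict_hits)
```

With the greedy delimiter all three lines yield their application name; with the strict one only the line that happens to have no padding matches.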

There is an enhancement request about this. https://github.com/logstash-plugins/logstash-filter-dissect/issues/11
If you wish, you could add your +1 to it.

Grigory_Shamov1 · June 28, 2017, 2:52pm

Hi Guy,

Thanks for the reply! Yes, joining spaces together is a major use case, but it would be great if dissect could switch this behaviour on and off with an option.

guyboertje · June 28, 2017, 3:08pm

I was looking at the code some minutes ago - it does not seem difficult. I will not be able to do an update for a few days though.

guyboertje · June 29, 2017, 9:09am

@Grigory_Shamov1

I have been thinking. How about if I added a suffix to indicate that the delimiter following this field should be greedy? I'm thinking ->, meaning that users with space-padded text have to opt in. It also means not having to commit the whole dissection to one behaviour.
Example:
Data:

2017-06-28 12:12:12                  SHORT: f1,,f3,,,f6

Mapping:

%{date/1} %{+date->} %{APP}: %{csv1},%{csv2},%{csv3},%{csv4},%{csv5},%{csv6}
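
To show the intended semantics, here is a toy sketch in Python. It is not the plugin's code: `parse_mapping` and `dissect` are hypothetical helpers, the `/1` ordering suffix is simply ignored, and joining appended values with a single space is an assumption made for the example.

```python
import re

FIELD = re.compile(r"%\{([^}]*)\}")

def parse_mapping(mapping):
    """Return (name, append, greedy, delimiter_after) for each field."""
    parts = []
    matches = list(FIELD.finditer(mapping))
    for i, m in enumerate(matches):
        name = m.group(1)
        greedy = name.endswith("->")      # proposed opt-in suffix
        if greedy:
            name = name[:-2]
        append = name.startswith("+")     # append to an earlier field
        if append:
            name = name[1:]
        name = name.split("/")[0]         # ordering suffix ignored in this sketch
        end = matches[i + 1].start() if i + 1 < len(matches) else len(mapping)
        parts.append((name, append, greedy, mapping[m.end():end]))
    return parts

def dissect(mapping, text):
    """Toy dissection: delimiters are literal; only '->' fields skip repeats."""
    event = {}
    pos = 0
    for name, append, greedy, delim in parse_mapping(mapping):
        if delim:
            idx = text.find(delim, pos)
            if idx == -1:
                idx = len(text)
            value = text[pos:idx]
            pos = min(idx + len(delim), len(text))
            if greedy:
                # Consume any further repeats of the same delimiter.
                while text.startswith(delim, pos):
                    pos += len(delim)
        else:
            value = text[pos:]            # last field takes the rest
            pos = len(text)
        if append and name in event:
            event[name] += " " + value
        else:
            event[name] = value
    return event

mapping = "%{date/1} %{+date->} %{APP}: %{csv1},%{csv2},%{csv3},%{csv4},%{csv5},%{csv6}"
data = "2017-06-28 12:12:12                  SHORT: f1,,f3,,,f6"
result = dissect(mapping, data)
print(result)
```

In this sketch the padding after the time is swallowed because of the -> on the second date field, while csv2, csv4 and csv5 come out as explicit empty strings.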

What do you think?

system · July 27, 2017, 9:10am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
