Is there a way to change the default behaviour of the dissect filter, which treats repeated delimiters as a single one?
It causes problems with empty fields between identical delimiters, for example in this comma-delimited string:
Aaa,bbb,,ccc,,,,ddd
Dissect would match only the four non-empty values, but I would like to have explicit empty values instead. I do not know beforehand which fields will end up empty. Is that possible without going back to the CSV filter?
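To make the difference concrete, here is a minimal Python sketch of the two behaviours on that string. It is not how dissect is implemented; `split_fields` and its `collapse_repeats` flag are hypothetical names used only for illustration.

```python
def split_fields(line, delimiter=",", collapse_repeats=False):
    """Split a line on a delimiter, optionally treating repeats as one."""
    fields = line.split(delimiter)
    if collapse_repeats:
        # Drop the empty strings produced by adjacent delimiters,
        # which mimics "repeated delimiters count as a single one".
        fields = [f for f in fields if f != ""]
    return fields


line = "Aaa,bbb,,ccc,,,,ddd"
collapsed = split_fields(line, collapse_repeats=True)
explicit = split_fields(line)

print(collapsed)  # ['Aaa', 'bbb', 'ccc', 'ddd']
print(explicit)   # ['Aaa', 'bbb', '', 'ccc', '', '', '', 'ddd']
```

The second result is what I am after: eight fields, with the empty ones kept in position.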
Dissect consumes multiple delimiters because of space padding.
When certain sections are padded with spaces depending on the number of characters in the section, a person creating a dissection cannot know how many spaces to use as the delimiter.
2017-06-28 12:12:12 SHORT: some message
2017-06-28 12:12:13 SERIOUSLYEXTREMELYLONG: some message
2017-06-28 12:12:14 VERYVERYLONG: some other message
The spaces after the date must be seen as one delimiter.
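As a rough illustration of why the padding case needs a greedy delimiter, here is a small Python sketch that treats each run of spaces as one delimiter. It only approximates the idea with regular expressions and is not the dissect implementation; the function name `dissect_padded` and the padded sample line are assumptions made for the example.

```python
import re


def dissect_padded(line):
    """Split 'date time LEVEL: message', treating each run of spaces as one delimiter."""
    # A run of one or more spaces counts as a single delimiter.
    date, time, rest = re.split(r" +", line, maxsplit=2)
    # The level ends at the first colon; padding after it is swallowed too.
    level, message = re.split(r": +", rest, maxsplit=1)
    return {"date": date, "time": time, "level": level, "message": message}


# Hypothetical padded input: the number of spaces is not known in advance.
padded = "2017-06-28   12:12:12 SHORT:      some message"
event = dissect_padded(padded)
print(event)
```

With a fixed single-space delimiter, the extra spaces would instead end up inside the field values or produce empty fields.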
Thanks for the reply! Yes, joining spaces together is a large use case; but it would be great if dissect could switch this behaviour on and off by an option.
I have been thinking. How about if I added a suffix to a field to indicate that the delimiter following it should be greedy? I'm thinking of ->, meaning that users with space-padded text have to opt in. It also means not having to commit the whole dissection to one behaviour.
Example:
Data: