Making the best of an existing Syslog stream

I would start by finding out whether you can use a syslog input. Syslog is many things to many people, and a syslog input expects RFC 3164 messages. If your senders produce something else, you might use a TCP input instead. (There was also a recent post suggesting using rsyslog to talk to all those syslog daemons and forward to Logstash.)
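As a rough sketch of the two options, something like the following could be a starting point. The port numbers and the type value are placeholders I made up, and you would normally keep only the input that matches what your senders actually emit.

```conf
input {
  # Option 1: senders emit RFC 3164 style messages.
  # Port 5514 is an arbitrary unprivileged port chosen for this example.
  syslog {
    port => 5514
  }

  # Option 2: the messages are not RFC 3164, so accept raw lines over TCP
  # and parse them yourself in the filter section.
  # The port and the type value are placeholders.
  tcp {
    port => 5515
    type => "raw_syslog"
  }
}
```

With the TCP input you get the unparsed line in the message field, so the timestamp, host and program extraction that the syslog input would have done has to happen in your own filters.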

If you need to selectively apply a multiline codec, you may want to have multiple pipelines. Have your main pipeline figure out which multiline treatment an event needs, based on host or a pattern match or whatever, then use tcp output/input pairs to hand the event to a pipeline whose input applies the multiline codec. If you cannot get multiline to work, it might be possible to use aggregate, but any pipeline running aggregate is restricted to a single pipeline worker thread (also, you want pipeline.java_execution set to false in logstash.yml until this bug is fixed).
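Here is a minimal sketch of the tcp output/input pairing, assuming two pipelines that are each registered separately in pipelines.yml. The host name, ports, file names and multiline pattern are all invented for illustration, and the field that identifies the sender (shown here as [host]) depends on your input and Logstash version.

```conf
# ---- main.conf (hypothetical file name): decide which treatment applies ----
input {
  tcp {
    port => 5515
  }
}
output {
  # Hypothetical host whose messages span several lines.
  if [host] == "app01.example.com" {
    tcp {
      host => "127.0.0.1"
      port => 6001
      # Forward only the raw line so the receiving codec sees the original text.
      codec => line { format => "%{message}" }
    }
  } else {
    tcp {
      host => "127.0.0.1"
      port => 6002
      codec => line { format => "%{message}" }
    }
  }
}

# ---- multiline.conf (hypothetical file name): runs as its own pipeline ----
input {
  tcp {
    port => 6001
    # Example pattern only: lines starting with whitespace are appended
    # to the previous line.
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}
```

A third pipeline listening on port 6002 would take the single-line traffic. Note that forwarding only the message text drops any fields the first pipeline added, so keep the routing decision simple and do the real parsing downstream.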

In a past experience taking on several new data feeds at the same time, I found it helpful to tag data once I thought it was being parsed correctly and feed that to a different index. Everything that had not been parsed properly was fed into a 'fixme' index.
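One way this could look, as a sketch rather than what I actually ran: the tag name, index names, grok pattern choice and Elasticsearch address are assumptions for the example.

```conf
filter {
  grok {
    # SYSLOGLINE is one of the stock grok patterns; substitute whatever
    # matches your feed.
    match => { "message" => "%{SYSLOGLINE}" }
    # add_tag is only applied when the filter succeeds.
    # "parsed_ok" is a made-up tag name.
    add_tag => [ "parsed_ok" ]
  }
}

output {
  if "parsed_ok" in [tags] {
    elasticsearch {
      hosts => ["http://localhost:9200"]
      index => "syslog-%{+YYYY.MM.dd}"
    }
  } else {
    # Anything that did not parse cleanly lands here for later inspection.
    elasticsearch {
      hosts => ["http://localhost:9200"]
      index => "fixme-%{+YYYY.MM.dd}"
    }
  }
}
```

Browsing the 'fixme' index then shows you exactly which messages still need a pattern, and as you add patterns the index should shrink.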