I am still a little new at this, but I believe it has to do with how the event gets processed. An event is created from the input, passed through the filters, and then handed to the outputs. It has to traverse each output before it is finished. Read this for more info on how it works.
Logstash only works on a handful of events at a time. Filters and outputs were sort of combined recently, but think of your input and your filter/output stage as separate applications that work independently of each other. The input takes data and puts it into a queue. That queue has a maximum size (20 events, I think). Once it hits that maximum, the input stops receiving data and putting it into the queue, essentially putting back pressure on whatever is feeding it. The filter/output stage checks this queue and pulls in events for processing. However, it does not go back to the queue for more until it is completely done with an event, and that means going through every single output.
In other words:
1. Event is sent to Logstash and received by the input.
2. Event is processed by the input and put into the queue.
3. Filter/output checks the queue for new events, finds one, and pulls it into the processing pipeline.
4. Event is sent to the Elastic cluster.
5. Event is sent to the TCP location.
6. Filter/output checks the queue for new events again.
So what is happening is that step 5 takes longer than anything else, but the pipeline can't move on to step 6 and pull in a new event until step 5 is done. The TCP output therefore slows down everything, all the way back to step 1, because the input can't receive data while the queue is full. The entire structure is only as fast as it can output data.
I can't really think of a good way to change it, though. For example, if Elastic can handle 10k events per second and TCP can only handle 5k per second, there is a 5k-per-second difference. What would you do with those events? They would have to go somewhere. I guess in theory you could add a queue between steps 4 and 5 and then loop back to step 1 to receive more data. That would allow everything to be sent to Elastic immediately. But how big would you allow that queue to get? In our example it would grow by 5k events every single second. If you had a big peak during the day it could get caught up overnight, but if the data is consistently high it would just get bigger forever and ever.