Try reading through this link, particularly the part about message queueing.
If you are using a pipeline that has a message queue, each tier has its own purpose. How Logstash is configured depends on which tier it is running in.
Collector = These days this role is mostly filled by Beats rather than Logstash. It runs on the server that is generating the logs and sends data to a shipper. There may be hundreds of these all sending data to the same shipper machine.
Shippers = Receive data from lots of different sources and machines. Ideally they do as little processing as possible (no filters) and then send the data off to a message queue such as Kafka or Redis.
Processors (I believe these are the same as indexers) = Pull data from the message queue, apply whatever filters are necessary, and then insert it into Elasticsearch. These tend to be more CPU-hungry than the shippers.
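To make the division of labour concrete, here is a rough Python sketch of the three tiers. It is only an illustration, not Logstash configuration: `queue.Queue` stands in for Kafka or Redis, a plain list stands in for Elasticsearch, and the function names, host names and the "split off the log level" filter are all made up for the example.

```python
import json
import queue


def collector(host, lines):
    """Runs next to the application and forwards raw lines untouched."""
    return [{"host": host, "message": line} for line in lines]


def shipper(events, mq):
    """Accepts events from many collectors and enqueues them with no filtering."""
    for event in events:
        mq.put(json.dumps(event))


def processor(mq, index):
    """Drains the queue, applies the filters, then indexes the result."""
    while True:
        try:
            raw = mq.get_nowait()
        except queue.Empty:
            break
        event = json.loads(raw)
        # Hypothetical filter: split "LEVEL rest of message" into two fields.
        level, _, text = event["message"].partition(" ")
        event["level"] = level
        event["message"] = text
        index.append(event)


def run_demo():
    mq = queue.Queue()  # stand-in for Kafka or Redis
    index = []          # stand-in for Elasticsearch
    shipper(collector("web-1", ["INFO started", "ERROR disk full"]), mq)
    shipper(collector("web-2", ["WARN slow response"]), mq)
    processor(mq, index)
    return index


if __name__ == "__main__":
    for doc in run_demo():
        print(doc)
```

The point of the layout is that the shipper does nothing but move data, so all of the parsing cost lands on the processor, which can be scaled separately because it only talks to the queue.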
This adds a lot of extra servers to manage, but it offers a lot of benefits too. If all you need to do is log a few thousand messages per day, none of this really applies. If you want to log 500 million messages per day, you won't be able to get away with a simple setup.
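To put those two volumes in perspective, a quick back-of-the-envelope calculation helps. The 5,000 figure below is just an assumed stand-in for "a few thousand", and these are daily averages, so real peaks will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(messages_per_day):
    """Average messages per second for a given daily volume."""
    return messages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(average_rate(5_000), 3))     # 0.058 messages per second
    print(round(average_rate(500_000_000)))  # 5787 messages per second
```

At well under one message per second a single Logstash instance has nothing to buffer, while a sustained average in the thousands per second is where the queue and separate processor tier start to earn their keep.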