How to improve performance of elasticsearch input in Logstash

I want to use Logstash to pull existing documents out of Elasticsearch, run them through filters, and then update them in the same cluster. I'm trying to figure out how to get the best performance, and the first thing I'm looking at is the elasticsearch input plugin. I know that pipeline workers only apply to the filter and output stages.

I have 3 ES nodes, all on the same subnet. I created a new Ubuntu 14.04 VM in the same subnet and installed LS 5.0-alpha5 from the Debian package. The VM has 8 GB of RAM and 4 cores.

Taking a tip from a recent Elastic blog post, I decided to set up a Logstash config file like the following:

input {
  elasticsearch {
    hosts   => ["hostA", "hostB", "hostC"]
    index   => "specific-index"
    docinfo => true
    query   => '
      {
        "query": {
          "match": {
            "_type": {
              "query": "MyType"
            }
          }
        }
      }
    '
  }
}

output {
  stdout { codec => dots }
}

Then I run the following command:

sudo ./logstash -f /etc/logstash/conf.d/logstash.conf --path.settings=/etc/logstash | pv -Wart > /dev/null

When I run this, I get around 3-4 kB/s at most.

I'm wondering what I can do to improve the throughput of this Logstash config / elasticsearch input plugin. Are there any settings that could improve this? Is there anything on the Elasticsearch nodes that needs to change?
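For example, would increasing the scroll batch size on the input make a difference? As a rough sketch (I haven't tested this yet, and I'm assuming the plugin's size and scroll options behave as documented), something like:

input {
  elasticsearch {
    hosts   => ["hostA", "hostB", "hostC"]
    index   => "specific-index"
    docinfo => true
    # pull more documents per scroll request (the plugin default is 1000, as I read the docs)
    size    => 5000
    # keep each scroll context alive longer than the default 1m
    scroll  => "5m"
    query   => '{ "query": { "match": { "_type": { "query": "MyType" } } } }'
  }
}

Or is the bottleneck more likely to be somewhere else entirely?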

Are you monitoring everything to see where it may be limited?

@warkolm I'm not sure what you mean by monitoring. Are you referring to the monitoring APIs in LS 5.0? Or is there another way I should be monitoring this (on the Elasticsearch side)?

I haven't used any of the monitoring APIs in LS 5.0. Do you have any idea which APIs would be best for seeing what is happening with the input plugin?
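The only candidate I've come across so far is the node stats endpoint on the Logstash HTTP API, something like this (assuming the API is available in the alpha and listening on the default port 9600):

curl -s 'localhost:9600/_node/stats?pretty'

Would that show enough about the input side, or should I also be watching the Elasticsearch nodes themselves?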

Monitoring anything, really.
How do you know whether it's LS or ES? Are you monitoring system resources?
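Even something basic while the job is running would help narrow it down, e.g. (hostnames are just placeholders):

top                                        # CPU / memory on the Logstash VM
iostat -x 5                                # disk utilisation on each ES node
curl -s 'hostA:9200/_nodes/stats?pretty'   # Elasticsearch node stats

That will at least tell you whether the boxes doing the work are actually busy.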