Performance degradation after upgrading to ES 2.2.0

Hi Folks,

I've created a new cluster based on ES 2.2.0 and Logstash 2.2.0, using the Logstash kafka input and elasticsearch output.
The cluster is built on AWS: 10 r3.2xlarge data nodes with a 3TB data_dir, and 12 Logstash indexers on c3.large instances, each running a local client-only Elasticsearch (no data, no master). All app servers produce logs to Kafka, and the indexers consume them into ES. The ES cluster is indexing messages about 20% slower than the app servers are pushing them to Kafka. The cluster receives about 1.5 billion messages per day.
Data node load is about 30%, iowait is about 15%, and disk I/O utilization is about 30%-50%.
The same setup on ES 1.7 worked well.

Logstash conf:
```
input {
  kafka {
    topic_id => "apache-accesslog"
    zk_connect => "zookeeper1a:2181,zookeeper1b:2181,zookeeper1c:2181/kafka"
    queue_size => 3000
    group_id => "logstash_indexer_apache_accesslog"
  }
}

filter {
  if "metric" in [tags] or "_bad_clientip" in [tags] {
    drop {}
  }
}

output {
  if "apache-accesslog" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "apache-accesslog-%{+YYYY.MM.dd}"
      flush_size => 3000
      workers => 3
    }
  }
}
```

Elasticsearch config on a Logstash indexer node:

```
cluster.name: production.eu-west-1
node.name: logstash-i-95e52a1f.a.production.eu-west-1
node.max_local_storage_nodes: 1
path.conf: /etc/elasticsearch
path.data: /usr/share/elasticsearch
path.logs: /var/log/elasticsearch
network.host: eth0:ipv4,local
http.port: 9200
discovery.zen.ping.multicast.enabled: true
action.destructive_requires_name: true
cloud.node.auto_attributes: true
cloud.aws.access_key: somekey
cloud.aws.region: eu-west-1
cloud.aws.secret_key: somekey
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: spot,ondemand
discovery.ec2.any_group: true
discovery.ec2.availability_zones: eu-west-1a,eu-west-1b,eu-west-1c
discovery.ec2.host_type: private_ip
discovery.ec2.ping_timeout: 10s
discovery.ec2.tag.elasticsearch_cluster: production.eu-west-1
discovery.type: ec2
gateway.expected_nodes: 0
node.data: false
node.master: false
node.zone: spot
```

Logstash and ES each get about 40% of the memory on the instance.
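For reference, that split is set through the package defaults files; a minimal sketch, assuming Debian-style packages and that 40% of a c3.large's ~3.75 GB RAM works out to roughly 1.5 GB each (paths and exact values are assumptions):

```
# /etc/default/elasticsearch  (assumed location; value illustrative)
ES_HEAP_SIZE=1500m

# /etc/default/logstash  (assumed location; value illustrative)
LS_HEAP_SIZE=1500m
```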

The kafka input is working very fast, about 6k events/s (measured with the metrics filter). However, the elasticsearch output is very slow.
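For context, this is roughly how the rate was measured with the metrics filter; a minimal sketch of a temporary measuring config (the meter name and stdout format are assumptions). In the indexer config above these metric events are dropped, so they never reach ES:

```
filter {
  metrics {
    meter   => "events"
    add_tag => "metric"   # same tag the indexer filter drops
  }
}

output {
  if "metric" in [tags] {
    stdout {
      # print the 1-minute moving average event rate
      codec => line { format => "1m rate: %{[events][rate_1m]}" }
    }
  }
}
```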

Index template config:

```
{
  "order": 0,
  "template": "apache-accesslog-*",
  "settings": {
    "index": {
      "routing": {
        "allocation": {
          "total_shards_per_node": "4"
        }
      },
      "cache": {
        "field": {
          "type": "soft"
        },
        "filter": {
          "expire": "30m"
        }
      },
      "refresh_interval": "30s",
      "number_of_shards": "8",
      "translog": {
        "flush_threshold_ops": "20000"
      },
      "auto_expand_replicas": "false",
      "query": {
        "default_field": "uri"
      },
      "store": {
        "throttle": {
          "type": "node",
          "max_bytes_per_sec": "100mb"
        }
      },
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "default": {
      "_source": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "integers": {
            "mapping": {
              "index": "not_analyzed",
              "type": "integer",
              "doc_values": true
            },
            "match_mapping_type": "integer"
          }
        },
        {
          "strings": {
            "mapping": {
              "index": "not_analyzed",
              "type": "string",
              "doc_values": true
            },
            "match_mapping_type": "string"
          }
        }
      ],
      "_all": {
        "enabled": false
      }
    }
  }
}
```

Changes I made during the investigation: adding batch_size and workers to the Logstash process, workers and flush_size to the elasticsearch output, and queue_size to the kafka input. Increasing workers made it faster, but then I started to get 429 errors from Logstash.
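For reference, a sketch of the knobs involved (paths and values are illustrative, not exactly what I'm running). The 429s typically mean the bulk thread pool queue on the data nodes is filling up, which the second command shows:

```
# Logstash 2.2 pipeline workers (-w) and batch size (-b); values illustrative
/opt/logstash/bin/logstash -f /etc/logstash/conf.d -w 4 -b 500

# Check bulk thread pool activity/rejections across the cluster
curl -s 'localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'
```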

Any input or suggestions will be appreciated!