Concurrent indexing and searching can easily saturate 1000 IOPS. Why not use instance-attached storage instead?
You can reduce merging by increasing the indexing buffer (default 10% of heap), and by using only as many client-side indexing threads, and as many shards, as you need to achieve your required indexing throughput. For example, in the extreme case of 1 shard on the node and 1 client-side indexing thread, ES/Lucene would only ever write a single segment at a time, roughly the size of your indexing buffer, greatly reducing the later merge load.
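As a sketch, the indexing buffer is controlled by the `indices.memory.index_buffer_size` setting in `elasticsearch.yml`; the 30% value below is only an example, not a recommendation:

```yaml
# elasticsearch.yml -- example value, tune for your workload
# Give the in-memory indexing buffer more room before segments are
# flushed to disk (the default is 10% of the JVM heap, shared across
# the active shards on the node):
indices.memory.index_buffer_size: 30%
```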
You can also force ES to use only a single thread for merging. Merges may then fall behind in your use case, eventually forcing ES to throttle incoming indexing, but it should reduce the worst-case IOPS.
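For example (a sketch; verify the setting against your ES version), the merge scheduler's thread count can be capped like this:

```yaml
# elasticsearch.yml or per-index settings -- example only
# Limit Lucene's ConcurrentMergeScheduler to a single merge thread:
index.merge.scheduler.max_thread_count: 1
```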
That said, searching is potentially much more IOPS-heavy than merging, since it requires random access to e.g. the terms dictionary and stored fields. Merging is actually the "best case" for the way AWS counts IOPS (see AWS IOPs) because it is a sequential read of all the files being merged, followed by a sequential write of the merged segment's files.
But still, that's a 2X cost (reading then writing every byte). For SSD-backed EBS the maximum size of a single IOP is 256 KB, so 1000 IOPS gives you at most ~256 MB/sec of sequential throughput; split between reading the input segments and writing the merged one, that's ~128 MB/sec of effective merge write throughput, which is in fact not that much for fastish CPUs.
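A quick back-of-the-envelope check of those figures:

```python
# Rough check of the EBS throughput numbers above.

MAX_IOP_SIZE_KB = 256   # max size of a single IOP on SSD-backed EBS
IOPS = 1000             # provisioned IOPS in this example

# Raw sequential throughput if every IOP is full-sized:
raw_mb_per_sec = IOPS * MAX_IOP_SIZE_KB / 1024   # KB -> MB
print(raw_mb_per_sec)   # 250.0, i.e. ~256 MB/sec

# A merge reads every byte once and writes it once (the 2X cost),
# so the effective merge *write* throughput is roughly half:
merge_write_mb_per_sec = raw_mb_per_sec / 2
print(merge_write_mb_per_sec)   # 125.0, i.e. ~128 MB/sec
```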
Also, make sure your OS has plenty of free RAM for IO caching: set your JVM heap to the smallest size you need, and leave the rest of the RAM to the OS.
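For instance (the 4 GB figure is only an illustration; size it to your own workload), the heap can be pinned in the JVM options that ES starts with:

```
# jvm.options -- example only: pick the smallest heap your workload
# actually needs, leaving the remaining RAM for the OS filesystem cache.
-Xms4g
-Xmx4g
```

Setting the minimum and maximum to the same value avoids heap resizing pauses at runtime.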