I'm new to our dev team and I'm trying to troubleshoot our Elasticsearch performance issues.
I noticed that retrieving data from the ES instance takes 800 ms to 1200 ms for a result set as small as 1,000 records. I've never seen ES behave so slowly on a search. Indexing performance is even worse.
I looked at our cluster (which has 4 m4.large.elasticsearch nodes) and realized that we have 70+ indices in our production instance, the majority of which are out of date and kept for backup purposes only. We have 300+ active primary shards and 700+ active shards in total, even though our biggest index holds no more than 2 GB of data. Each of these indices is set up with the default of 5 shards and 1 replica.
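For reference, here is the rough arithmetic behind those shard counts as a small Python sketch. The function name `estimate_shards` is just something I made up for illustration, and the inputs are the round figures from above (70 indices, 5 primaries each, 1 replica, 4 nodes), so the real numbers will differ a bit.

```python
def estimate_shards(num_indices, primaries_per_index, replicas_per_primary, num_nodes):
    """Back-of-the-envelope shard count for a cluster (hypothetical helper)."""
    primaries = num_indices * primaries_per_index
    # Every primary has `replicas_per_primary` extra copies.
    total = primaries * (1 + replicas_per_primary)
    return {
        "primaries": primaries,
        "total": total,
        "per_node": total / num_nodes,
    }


# Round figures taken from the description above.
estimate = estimate_shards(
    num_indices=70,
    primaries_per_index=5,
    replicas_per_primary=1,
    num_nodes=4,
)
print(estimate)
```

With these inputs it works out to roughly 350 primaries, 700 shards in total, and about 175 shards per node, which lines up with what the cluster reports.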
I'm trying to understand (and need some expert opinion so I can go back to my team with findings) whether the total number of shards (700+) spread over so few nodes (4) would cause the horrible performance we are seeing, and just how unhealthy our cluster is, even though the monitor reports green.
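In case it helps anyone answering, this is a minimal sketch of how I would tally the actual shard distribution per node. It assumes the text comes from `GET _cat/shards?h=index,shard,prirep,state,node`. The helper `shards_per_node` and the sample rows are hypothetical and are not output from our cluster. It also ignores relocating shards, whose node column has a different shape.

```python
from collections import Counter


def shards_per_node(cat_shards_text):
    """Count shards per node from `_cat/shards` text output.

    Assumes whitespace-separated columns in the order:
    index, shard, prirep, state, node.
    """
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Unassigned shards have no node column.
        node = fields[4] if len(fields) >= 5 else "(unassigned)"
        counts[node] += 1
    return counts


# Hypothetical sample rows, for illustration only.
sample = """\
logs-a 0 p STARTED node-1
logs-a 0 r STARTED node-2
logs-a 1 p STARTED node-2
logs-b 0 p STARTED node-1
logs-b 0 r UNASSIGNED
"""

counts = shards_per_node(sample)
for node, n in sorted(counts.items()):
    print(node, n)
```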