We have a series of automated tasks that update different Elasticsearch (ES) indexes. One piece of data they all need is the person information that we also store in ES. Currently, each task uses a scroll request to read all the person information it needs into an in-memory cache. Sometimes all of these tasks run at the same time, for example after a server restart or an upgrade to the task harness that manages them.
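For context, here is a minimal sketch of roughly what each task does when it builds its cache. It is an illustration rather than our actual code: `load_person_cache`, the page size, and the keep-alive value are hypothetical, and it assumes a client object exposing `search()` and `scroll()` in the style of the official Python client.

```python
def load_person_cache(client, index, page_size=1000, keep_alive="2m"):
    """Read every document in `index` through the scroll API into a dict.

    Assumption: `client` offers search()/scroll() keyword arguments like the
    official Python Elasticsearch client. All names here are illustrative.
    """
    cache = {}

    # The first request opens the scroll context and returns the first page.
    resp = client.search(
        index=index,
        scroll=keep_alive,
        size=page_size,
        body={"query": {"match_all": {}}},
    )

    while True:
        hits = resp["hits"]["hits"]
        if not hits:
            # An empty page means the scroll is exhausted.
            break
        for hit in hits:
            # Key the in-memory cache by document id.
            cache[hit["_id"]] = hit["_source"]
        # Fetch the next page using the scroll id from the latest response.
        resp = client.scroll(scroll_id=resp["_scroll_id"], scroll=keep_alive)

    return cache
```

Each task runs its own copy of this loop, so after a restart several full scrolls over the same index are open at once.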
What we are seeing is that multiple concurrent scroll requests against the same index run significantly slower. A single scroll request normally takes about a minute when no others are running. When that same request is run at the same time as other similar requests (roughly 3-6 additional requests) against that index, each one takes closer to 11 minutes to complete. Marvel does not show that we are maxing out resources on the ES server. The index holds roughly 2.4 million documents.

Is there something other than a scroll request that we should use here, or a more efficient way to get the data we need? Keep in mind that we are still on ES 1.x, as we haven't been able to move to the latest version. If later versions include improvements in this area, please mention them in your answer, as that would help me convince the team to upgrade sooner rather than later.