I need to retrieve all documents from a particular index every day. I am using ScrollAll (below), but I have two issues.
- The number of docs in this index is currently static, but I receive varying doc counts when I run this code. The count in Kibana is constant. I really need this to be reliable.
- The speed is slow: using 12 slices, it takes 10+ minutes to get 1.6 million docs. I only need an Id field from each doc, but I haven't seen a way to get just that, if that would help. Any optimization advice for this would be appreciated.
```csharp
List resp = new List();
var eClient = ThirdPartyService.ElasticClient();

// number of slices in slice scroll
var numberOfSlices = 12;

var scrollObserver = eClient.ScrollAll<ThirdPartyProgram>("15m", numberOfSlices, s => s
    .MaxDegreeOfParallelism(numberOfSlices)
    .Search(search => search
        .Preference("iva")
        .Query(q => q
            .Match(m => m
                .Field(f => f.Source)
                .Query(value)
            )
        )
    )
).Wait(TimeSpan.FromMinutes(15), r =>
{
    r.SearchResponse.Documents.ForEach(x => resp.Add(x.SourceId));
});

var u = resp.DistinctBy(x => x).ToList();
return resp;
```
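
One possible factor in the varying counts, offered as an assumption rather than a confirmed diagnosis: if the callback passed to `Wait` is invoked concurrently from several slices, then calling `Add` on a plain list from multiple threads is not safe and can lose items. The sketch below does not touch Elasticsearch at all. `ParallelCollectSketch` and `CollectIds` are hypothetical names, and `Parallel.For` only stands in for the parallel slices. It shows the general technique of collecting ids into a `ConcurrentBag<string>` instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelCollectSketch
{
    // Hypothetical stand-in for the scroll callback: each "slice" runs in
    // parallel and adds its ids to one shared collection.
    public static int CollectIds(int slices, int idsPerSlice)
    {
        // ConcurrentBag<T> supports concurrent Add calls; List<T> does not.
        var ids = new ConcurrentBag<string>();

        Parallel.For(0, slices, slice =>
        {
            for (var i = 0; i < idsPerSlice; i++)
            {
                // Every (slice, i) pair yields a unique id.
                ids.Add($"{slice}-{i}");
            }
        });

        // Deduplicate once, after all parallel work has finished.
        return ids.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CollectIds(12, 10000)); // prints 120000
    }
}
```

If this assumption applies to the snippet above, swapping `resp` for a concurrent collection would be the analogous change. Whether it explains the whole discrepancy is not something this sketch can show.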