I have a few documents in a folder and I want to check whether all of them are indexed. To do so, for each document name in the folder, I would like to loop through the documents indexed in ES and compare. So I want to retrieve all the documents.
There are a few possible duplicates of this question on SO, but they didn't help me because the documentation has changed (there is nothing about scan in the current documentation).
I tried using client.Search<T>(), but as per the documentation, only 10 results are retrieved by default. I would like to get all the records without specifying the size, because the size of the index changes.
Or is it possible to get the size of the index first and then pass that number as the size, so that I get all the documents and can loop through them?
Is there a tutorial or sample in NEST for this? I tried using scroll, but once I have a scroll ID, how do I run the next search query with it?
For example: var response = client.Search<Document>(s => s.Scroll("2m")); --> retrieves only 10 documents
From this response, I got response.ScrollId: "Ascsjdbgkjabgkakjfgsadhvkjag"
How do I proceed from here? I just want to add all the file names to a list.
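    using System.Collections.Generic;
    using System.Linq;
    using Nest;

    List<string> indexedList = new List<string>();

    // Open a scan-type scroll. With SearchType.Scan the first response carries
    // only a scroll ID and no hits, and Size applies per shard.
    var scanResults = client.Search<ClassName>(s => s
        .From(0)
        .Size(2000)
        .MatchAll()
        .Fields(f => f.Field(fi => fi.propertyName)) // return only the needed field instead of the whole document
        .SearchType(Elasticsearch.Net.SearchType.Scan)
        .Scroll("5m")
    );

    // Fetch the first batch, then keep scrolling until a batch comes back empty.
    var results = client.Scroll<ClassName>("10m", scanResults.ScrollId);
    while (results.Documents.Any())
    {
        foreach (var doc in results.Fields)
        {
            indexedList.Add(doc.Value<string>("propertyName"));
        }

        // Always pass the scroll ID from the most recent response.
        results = client.Scroll<ClassName>("10m", results.ScrollId);
    }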
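Once indexedList is filled, the folder comparison described in the question could be sketched roughly as below. This is only an illustration using plain .NET, not part of NEST: the names IndexCheck and FindUnindexed are hypothetical, it assumes the indexed field holds bare file names, and it assumes a case-insensitive comparison is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IndexCheck
{
    // Hypothetical helper: returns the file names from folderFiles
    // that do not appear in indexedNames.
    public static List<string> FindUnindexed(
        IEnumerable<string> folderFiles,
        IEnumerable<string> indexedNames)
    {
        // Assumption: names are compared case-insensitively.
        // A HashSet avoids re-scanning the indexed list for every file.
        var indexed = new HashSet<string>(indexedNames, StringComparer.OrdinalIgnoreCase);

        return folderFiles
            .Select(path => Path.GetFileName(path)) // strip any directory part
            .Where(name => !indexed.Contains(name))
            .ToList();
    }
}
```

It might be called as IndexCheck.FindUnindexed(Directory.GetFiles(folderPath), indexedList), where folderPath is a placeholder for the folder being checked. Building the set once keeps the check to a single pass over the folder instead of a nested loop over both lists.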