I've examined all the questions on this subject. None seems to address the problem I'm having.
I need to loop through multiple delete_by_query calls; as I've set things up for experimenting, just a handful. The problem I keep getting, intermittently, is with the deleted field in the response object.
I know for a fact that each query should delete multiple documents (Lucene documents, "LDocs" below) from the index: between 10 and several hundred. Thus deleted in the response should never be 0 in what follows:
#[derive(Deserialize, Debug)]
struct DeleteResponse {
    took: usize,
    // in this field the ES server tells you how many LDocs have been deleted:
    deleted: usize,
    total: usize,
    batches: usize,
    ...
}
let url = format!("{ES_URL}/{}/_delete_by_query?refresh=true", self.index_name);
for _ in 0..x {
    ...
    let text_document = ... // obtain from map
    ...
    let data: serde_json::Value = json!({
        "query": {
            "match": {
                "text_ldoc_number": text_document.text_doc_ldoc_number,
            }
        }
    });
    // NB as stated, the above always matches between 10 and several hundred LDocs
    // allowing the possibility of trying several times
    let delete_attempts = 5;
    for i in 0..delete_attempts {
        let delete_response: DeleteResponse = reqwest_call!(&url, reqwest::Method::POST, body_str=&data.to_string())?;
        if delete_response.deleted == 0 {
            if i == delete_attempts - 1 {
                return Err(anyhow!("deletion of LDocs with text_doc_ldoc_number {text_doc_ldoc_number} resulted in 0 LDocs being deleted after 5 attempts"))
            }
            // not the final try: sleep thread for 20 ms
            let wait_time_ms = time::Duration::from_millis(20);
            thread::sleep(wait_time_ms);
        }
        else {
            // deleted == non-zero: this operation has worked OK: leave "tries" loop
            break
        }
    }
}
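One way to narrow down why deleted comes back as 0 is to look at the other counters in the same response. The following is only a sketch under assumptions: DeleteCounters, DeleteOutcome and classify are hypothetical names, not part of my code above, and the interpretation of each case is a guess to be checked against the server's behaviour. The fields version_conflicts and timed_out are part of the delete_by_query response body alongside total and deleted; in real code the struct would derive Deserialize like DeleteResponse does, which is left out here so the snippet compiles with the standard library alone.

```rust
/// Counters copied from a `_delete_by_query` response body.
/// Hypothetical helper type: in real code it would derive `serde::Deserialize`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeleteCounters {
    pub total: usize,
    pub deleted: usize,
    pub version_conflicts: usize,
    pub timed_out: bool,
}

/// A coarse reading of what a response is telling us (assumed interpretation).
#[derive(Debug, PartialEq, Eq)]
pub enum DeleteOutcome {
    /// At least one document was deleted.
    Deleted(usize),
    /// The request reported a timeout.
    TimedOut,
    /// Nothing deleted, but version conflicts were counted.
    Conflicts(usize),
    /// The query processed no documents at all.
    NothingMatched,
    /// Documents were processed, none deleted, and no counter explains why.
    Unexplained,
}

/// Classify a response by checking the counters in a fixed order.
pub fn classify(c: DeleteCounters) -> DeleteOutcome {
    if c.deleted > 0 {
        DeleteOutcome::Deleted(c.deleted)
    } else if c.timed_out {
        DeleteOutcome::TimedOut
    } else if c.version_conflicts > 0 {
        DeleteOutcome::Conflicts(c.version_conflicts)
    } else if c.total == 0 {
        DeleteOutcome::NothingMatched
    } else {
        DeleteOutcome::Unexplained
    }
}
```

Logging the classified outcome on each failed attempt would at least show whether the zero counts coincide with total == 0 (the query saw no matching LDocs at that moment) or with something else, such as conflicts.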
As can be seen, I've tried appending "?refresh=true" to the URL. In fact this doesn't seem to make much difference.
With the sleep set as above (20 ms), the deleted value is sometimes 0 and sometimes not. A shorter sleep tends to produce more 0 values. But the behaviour is very unstable: on some runs there are no failures at all, and everything just works on the first try.
Tuning the sleep duration by trial and error like this is obviously unsatisfactory. I want to find out the actual reasons behind these (silent) intermittent failures of my ES instruction. It appears ES may need time to "digest" these deletions before moving on to the next delete_by_query operation. Can I detect when this "digestion" has ended?
NB sometimes the very first delete_by_query operation fails on its first attempt. Occasionally the very first operation fails 5 times and thus returns the Err. So it appears that, before running any delete_by_query, a check is needed that the index and server are in a particular "settled/receptive" state...
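To make the idea of such a check concrete, here is a minimal sketch of the two requests I could imagine sending before the first delete_by_query: an explicit refresh of the index (POST to its _refresh endpoint) and a cluster health call that waits for a given status (GET on _cluster/health with wait_for_status and timeout). Both endpoints exist in Elasticsearch; whether calling them actually removes the zero counts is an assumption I have not verified. The function names refresh_url and health_url, and the base URL and index name in the usage note, are hypothetical.

```rust
/// Build the URL for an explicit refresh of one index:
/// `POST <base>/<index>/_refresh`.
pub fn refresh_url(base: &str, index: &str) -> String {
    // Drop any trailing slash on the base so the path has no "//".
    format!("{}/{}/_refresh", base.trim_end_matches('/'), index)
}

/// Build the URL for a health check that blocks until the index reaches
/// at least `status` ("green", "yellow" or "red") or the timeout elapses:
/// `GET <base>/_cluster/health/<index>?wait_for_status=...&timeout=...s`.
pub fn health_url(base: &str, index: &str, status: &str, timeout_secs: u32) -> String {
    format!(
        "{}/_cluster/health/{}?wait_for_status={}&timeout={}s",
        base.trim_end_matches('/'),
        index,
        status,
        timeout_secs
    )
}
```

For example, health_url("http://localhost:9200", "my_index", "yellow", 30) yields a URL whose response contains a timed_out flag, which would say whether the wait succeeded before the first deletion is attempted.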