I am using the Kibana Dev Tools console to modify an Elasticsearch index.
My request reads the value of one field and writes it to another field of the same index. The index contains 20 million documents.
This is my script:
POST /vjdb2/_update_by_query?scroll=5m&timeout=5m
{
  "conflicts": "proceed",
  "script": {
    "source": """
      if (ctx._source.containsKey('date') && ctx._source.collection_date == null) {
        String dateValue = ctx._source.date;
        if (dateValue =~ /^\d{4}$/) {
          // Single year, convert to YYYY-01-01
          ctx._source.collection_date = dateValue + '-01-01';
        } else if (dateValue =~ /^\d{4}-\d{2}$/ || dateValue =~ /^\d{4}-\d{2}-\d{2}$/) {
          // Valid YYYY-MM or YYYY-MM-DD, copy directly
          ctx._source.collection_date = dateValue;
        }
      }
    """,
    "lang": "painless"
  },
  "query": {
    "bool": {
      "must": {
        "exists": {
          "field": "date"
        }
      },
      "must_not": {
        "exists": {
          "field": "collection_date"
        }
      }
    }
  },
  "size": 1000
}
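To make the intended transformation explicit, here is a minimal Python sketch of the rule the Painless script above applies to each document. It is only an illustration, not part of the request: the function name normalize_date and the plain-dict document are assumptions, and it assumes the date field holds a string.

```python
import re
from typing import Optional

# re.ASCII keeps \d limited to 0-9, which is what Java/Painless regexes do by default.
YEAR = re.compile(r"^\d{4}$", re.ASCII)
YEAR_MONTH = re.compile(r"^\d{4}-\d{2}$", re.ASCII)
FULL_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$", re.ASCII)


def normalize_date(doc: dict) -> Optional[str]:
    """Return the value the script would write to collection_date,
    or None when the document would be left unchanged."""
    # Same guard as the script: 'date' must exist and collection_date must be unset.
    if "date" not in doc or doc.get("collection_date") is not None:
        return None

    value = doc["date"]
    if not isinstance(value, str):
        # Assumption: non-string dates are skipped here for simplicity.
        return None

    if YEAR.search(value):
        # Single year, convert to YYYY-01-01
        return value + "-01-01"
    if YEAR_MONTH.search(value) or FULL_DATE.search(value):
        # Valid YYYY-MM or YYYY-MM-DD, copy directly
        return value
    # Anything else (free text, other formats) is not copied.
    return None


if __name__ == "__main__":
    for sample in ({"date": "1999"}, {"date": "1999-05"}, {"date": "May 1999"}):
        print(sample, "->", normalize_date(sample))
```

Note that a document whose date matches none of the three patterns never gets a collection_date, so it keeps matching the query on every run.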
The request starts running correctly, but after approximately one minute it stops and I get this error:
{
  "statusCode": 502,
  "error": "Bad Gateway",
  "message": "Client request timeout"
}
I tried to fix it by changing the "Refresh frequency" setting of the console, adding slicing, and using batches and scrolling during the update. Each time I execute the request, it keeps working until it is terminated by the error (probably a timeout).
For example, at one point the number of docs left to be corrected was 7154828, and after one execution it dropped to 7054943, so 99885 docs were updated in that run.
FYI: here is the request I use to check the number of remaining docs.
POST /vjdb2/_search?scroll=1m
{
  "_source": ["date", "collection_date"],
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "collection_date"
        }
      },
      "must": {
        "exists": {
          "field": "date"
        }
      }
    }
  },
  "track_total_hits": "true",
  "size": 1000
}
PS: The same request sent with curl from the same instance works fine and finishes without any error. How can I fix the timeout in the Kibana console?