Our index has a number of docs with bad ids. They were set by this script during a previous reindex:
def lowerCaseAuthor = ctx.author_string != null ? ctx.author_string.toLowerCase() : null;
def lowerCasePub = ctx.publication_fqdn != null ? ctx.publication_fqdn.toLowerCase() : null;
ctx._id = lowerCaseAuthor + lowerCasePub + ctx.collection_id
The problem is that some of our author_string fields are very long, more than 256 characters, which makes the resulting ids longer than the allowed limit (512 bytes, according to the error below).
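To make the length problem concrete, here is a rough Python sketch of how those ids get built and measured. It is only an illustration. The names build_id, id_lengths and is_too_long and the sample values are made up for this post. Missing fields are treated as empty strings, which may not match what the script does. It also assumes the 512-byte limit from the error applies to the UTF-8 encoding of the id.

```python
from typing import Optional, Tuple

MAX_ID_BYTES = 512  # limit quoted in the validation error


def build_id(author_string: Optional[str],
             publication_fqdn: Optional[str],
             collection_id: str) -> str:
    """Mimic the reindex script: lower-cased author + lower-cased publication + collection id."""
    # Simplification: missing values become empty strings here,
    # which may differ from what the original script produces.
    author = author_string.lower() if author_string is not None else ""
    pub = publication_fqdn.lower() if publication_fqdn is not None else ""
    return author + pub + collection_id


def id_lengths(doc_id: str) -> Tuple[int, int]:
    """Return (character count, UTF-8 byte count) for an id."""
    return len(doc_id), len(doc_id.encode("utf-8"))


def is_too_long(doc_id: str) -> bool:
    """Check the id against the byte limit."""
    # Assumption: the limit is measured on the UTF-8 encoding of the id.
    return len(doc_id.encode("utf-8")) > MAX_ID_BYTES


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the real index.
    ascii_id = build_id("A" * 300, "Example.COM", "42")
    wide_id = build_id("日" * 200, "example.com", "42")
    for label, doc_id in (("ascii", ascii_id), ("wide", wide_id)):
        chars, size = id_lengths(doc_id)
        print(label, chars, size, is_too_long(doc_id))
```

In this sketch a 313-character ASCII id stays under 512 bytes, while a 213-character id made of 3-byte characters comes out at 613 bytes, so the character count and the byte count can disagree in both directions.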
Now that I am trying to reindex again, I am getting frequent errors like this one:
"error": {
  "type": "action_request_validation_exception",
  "reason": "Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 529;"
}
I had thought that maybe I could filter out the bad docs in the reindex. I came up with this query, which seems to work fine as a standalone query:
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": "doc['_id'].value.length() > 256"
        }
      }
    }
  }
}
I used it to count the number of docs with ids that are too long. But when I apply the inverse of this script query to my reindex:
{
  "source": {
    "index": "testing",
    "query": {
      "bool": {
        "filter": {
          "script": {
            "script": "doc['_id'].value.length() < 256"
          }
        }
      }
    }
  },
  "dest": {
    "index": "testing_v6"
  }
}
I am back to square one. It runs for a while and then I get the same validation error as above:
"error": {
  "type": "action_request_validation_exception",
  "reason": "Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 529;"
}