Help! I can't filter out docs with bad ids in reindex

Our index has a number of docs with bad ids. They were set by this script during a previous reindex:

def lowerCaseAuthor = ctx.author_string != null ? ctx.author_string.toLowerCase() : null;
def lowerCasePub = ctx.publication_fqdn != null ? ctx.publication_fqdn.toLowerCase() : null;
ctx._id = lowerCaseAuthor + lowerCasePub + ctx.collection_id;

The problem is that some of our author_string fields are very long, more than 256 characters, which makes the resulting ids longer than the allowed limit (512 bytes, according to the error below).

Now that I am trying to reindex again, I am getting frequent errors like this one:

"error": {
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 529;"
  }

I had thought that maybe I could filter out the bad docs in the reindex. I came up with this query, which seems to work fine as a standalone query:

{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": "doc['_id'].value.length() > 256"
        }
      }
    }
  }
}

It counts the number of docs with ids that are too long. But when I apply the inverted version of this script filter to my reindex:

{
  "source": {
    "index": "testing",
    "query": {
      "bool": {
        "filter": {
          "script": {
            "script": "doc['_id'].value.length() < 256"
          }
        }
      }
    }
  },
  "dest": {
    "index": "testing_v6"
  }
}

I am back to square one. It runs a bit and then I get the same validation error as above:

"error": {
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 529;"
  }
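
One thing that may be worth checking here: the script filter compares a character count against 256, while the validation error reports a size in bytes (512). For ids containing non-ASCII characters those two measures can disagree, so an id can pass a `length() < 256` check and still be over 512 bytes. Whether that is what happens in this index is not confirmed; the following is only a minimal Python sketch of the difference. The helper names `build_id` and `id_too_long` and the sample values are hypothetical, and treating missing fields as empty strings is an assumption (the original script's null handling may differ).

```python
ID_LIMIT_BYTES = 512  # limit quoted in the validation error


def build_id(author_string, publication_fqdn, collection_id):
    """Hypothetical helper mirroring the id concatenation in the reindex script.

    Assumption: missing fields are treated as empty strings here.
    """
    author = (author_string or "").lower()
    pub = (publication_fqdn or "").lower()
    return author + pub + str(collection_id)


def id_too_long(doc_id):
    """Measure the id the way the error message does: in UTF-8 bytes."""
    return len(doc_id.encode("utf-8")) > ID_LIMIT_BYTES


# Made-up sample data: an ASCII author and an author using 3-byte characters.
ascii_id = build_id("A" * 300, "Example.COM", 42)
cjk_id = build_id("あ" * 200, "example.com", 42)

for name, doc_id in (("ascii", ascii_id), ("cjk", cjk_id)):
    chars = len(doc_id)
    size = len(doc_id.encode("utf-8"))
    print(name, "chars:", chars, "bytes:", size, "too long:", id_too_long(doc_id))
```

In this sketch the second id is well under 256 characters but over 512 bytes, so a filter based on character count alone would let it through.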

Please don't create multiple topics on the same thing. See: Cannot reindex due to bad id field(s)

Feel free to bump your original topic and add a bit more info to it though :)