Trouble Reindexing Child Documents with join field in ES 5.6

My issue is similar to this closed topic. I am trying to reindex parent/child documents in ES 5.6 in preparation for an ES 6.x upgrade. I was able to reindex the parent documents without a problem, but when I try to reindex the child documents I keep getting the error "[routing] is missing for join field".

Below is the code I am using to make the reindex call; as you can see, the parent is being specified. I am sure the mappings are set up correctly, because I was able to index child documents directly and they were properly joined to their parents. Reindexing is the only place I am having an issue. I think I might be missing something in the script, but I am not sure what. Any help would be greatly appreciated.

POST _reindex
{
  "source": {
    "index":"development",
    "type": "vuln",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "a_mapping_test"
  },
  "script": {
    "source": "ctx._type='doc';ctx._source.asset_vuln_join=['name':'vuln', 'parent': ctx._parent];ctx.remove('_parent');"
  }
}

Error Response

{
      "index": "a_mapping_test",
      "type": "doc",
      "id": "123",
      "cause": {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "[routing] is missing for join field [asset_vuln_join]"
        }
      },
      "status": 400
    }

You need to set ctx._routing in your script. Set it equal to ctx._parent (before you remove it).

Tried that as well and I still get the same error. I tried setting it to _parent like below and I also tried setting it to _routing since that is already set in my current index. Both result in the same error as before.

  "script": {
    "source": "ctx._routing=ctx._parent;ctx._type='doc';ctx._source.asset_vuln_join=['name':'vulnerability', 'parent': ctx._parent];ctx.remove('_parent');"
  }

I'm wondering if some of your vuln type documents in the development index are missing the _parent field. That would explain this error. Can you post the document with the ID 123 that's causing the error that you are seeing?

Checked the docs in the development index and they all have parent and routing set correctly. I changed the id to 123 for simplicity but here is the actual doc and the error message together.

# Vuln Doc
{"_index"=>"development",
 "_type"=>"vuln",
 "_id"=>"18305772197390840",
 "_version"=>1,
 "_routing"=>"18305769100814537",
 "_parent"=>"18305769100814537",
 "found"=>true,
 "_source"=> ...

# ERROR
 {
      "index": "a_mapping_test",
      "type": "doc",
      "id": "18305772197390840",
      "cause": {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "[routing] is missing for join field [asset_vuln_join]"
        }
      },
      "status": 400
    },

I think you may be running into this issue: https://github.com/elastic/elasticsearch/issues/26183

As a workaround, try giving the parent documents a different ID when you reindex those, for example parent_18305769100814537 instead of just 18305769100814537. Next, change your script to use that new ID to route the child documents:

"script": {
    "source": "ctx._routing='parent_'+ ctx._parent;ctx._type='doc';ctx._source.asset_vuln_join=['name':'vulnerability', 'parent': 'parent_' + ctx._parent];ctx.remove('_parent');"
  }

I think this will resolve the problem.
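
For the parent side of the workaround, the reindex call might look roughly like the sketch below. The source type name asset and the join relation name 'asset' are assumptions on my part (I'm guessing from the join field name asset_vuln_join), so substitute whatever your mapping actually uses.

POST _reindex
{
  "source": {
    "index": "development",
    "type": "asset"
  },
  "dest": {
    "index": "a_mapping_test"
  },
  "script": {
    "source": "ctx._id='parent_'+ctx._id;ctx._type='doc';ctx._source.asset_vuln_join='asset';"
  }
}

Run this before the child reindex, so that the prefixed parent IDs already exist when the children are routed to them.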

Yep, that did the trick! Now I just need to figure out the best way to flip back to the original IDs once everything is reindexed since update_by_query doesn't work with _routing, whomp whomp.

Thanks for the quick answers!!!

You could always reindex the documents again into another index and, in doing so, flip back to the original parent IDs.
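
A minimal sketch of that second pass, assuming the 'parent_' prefix from the workaround above and a hypothetical destination index a_mapping_final that has already been created with the same join mapping. I haven't run this against your data, so treat it as a starting point rather than a tested solution.

POST _reindex
{
  "source": {
    "index": "a_mapping_test"
  },
  "dest": {
    "index": "a_mapping_final"
  },
  "script": {
    "source": "def p = 'parent_'; if (ctx._id.startsWith(p)) { ctx._id = ctx._id.substring(p.length()); } if (ctx._routing != null && ctx._routing.startsWith(p)) { ctx._routing = ctx._routing.substring(p.length()); } def j = ctx._source.asset_vuln_join; if (j instanceof Map && j.parent != null && j.parent.startsWith(p)) { j.parent = j.parent.substring(p.length()); }"
  }
}

The script strips the prefix in three places: the parent's own _id, the child's _routing, and the parent value inside the child's join field. Documents that don't carry the prefix pass through unchanged.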
