Trouble Reindexing Child Documents with join field in ES 5.6


(Struve) #1

My issue is similar to this closed topic. I am trying to reindex parent/child documents in ES 5.6 in preparation for an ES 6.x upgrade. I was able to reindex the parent documents without a problem, but when I try to reindex the child documents I keep getting the error "[routing] is missing for join field".

Below is the request I am using to make the reindex call; as you can see, the parent is being specified. I am also confident that the mappings are set up correctly, because I was able to index child documents directly and they were properly joined to their parents. Reindexing is the only place I am having an issue. I think I might be missing something in the script, but I am not sure what. Any help would be greatly appreciated.

POST _reindex
{
  "source": {
    "index":"development",
    "type": "vuln",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "a_mapping_test"
  },
  "script": {
    "source": "ctx._type='doc';ctx._source.asset_vuln_join=['name':'vuln', 'parent': ctx._parent];ctx.remove('_parent');"
  }
}

Error Response

{
      "index": "a_mapping_test",
      "type": "doc",
      "id": "123",
      "cause": {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "[routing] is missing for join field [asset_vuln_join]"
        }
      },
      "status": 400
    }

(Abdon Pijpelink) #2

You need to set ctx._routing in your script. Set it equal to ctx._parent (before you remove it).


(Struve) #3

Tried that as well and I still get the same error. I tried setting it to _parent like below and I also tried setting it to _routing since that is already set in my current index. Both result in the same error as before.

  "script": {
    "source": "ctx._routing=ctx._parent;ctx._type='doc';ctx._source.asset_vuln_join=['name':'vulnerability', 'parent': ctx._parent];ctx.remove('_parent');"
  }

(Abdon Pijpelink) #4

I'm wondering if some of your vuln type documents in the development index are missing the _parent field. That would explain this error. Can you post the document with the ID 123 that's causing the error that you are seeing?


(Struve) #5

Checked the docs in the development index and they all have parent and routing set correctly. I changed the id to 123 for simplicity but here is the actual doc and the error message together.

# Vuln Doc
{"_index"=>"development",
 "_type"=>"vuln",
 "_id"=>"18305772197390840",
 "_version"=>1,
 "_routing"=>"18305769100814537",
 "_parent"=>"18305769100814537",
 "found"=>true,
 "_source"=> ...

# ERROR
 {
      "index": "a_mapping_test",
      "type": "doc",
      "id": "18305772197390840",
      "cause": {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "[routing] is missing for join field [asset_vuln_join]"
        }
      },
      "status": 400
    },

(Abdon Pijpelink) #6

I think you may be running into this issue: https://github.com/elastic/elasticsearch/issues/26183

As a workaround, try giving the parent documents a different ID when you reindex those, for example parent_18305769100814537 instead of just 18305769100814537. Next, change your script to use that new ID to route the child documents:

"script": {
    "source": "ctx._routing='parent_'+ ctx._parent;ctx._type='doc';ctx._source.asset_vuln_join=['name':'vulnerability', 'parent': 'parent_' + ctx._parent];ctx.remove('_parent');"
  }

I think this will resolve the problem.
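For the parent half of this workaround, a minimal sketch of the reindex request might look like the one below. It assumes the parent type in the development index and the parent relation in the join field are both called asset; neither name appears in this thread, so substitute your own. The only essential part is that ctx._id gets the same parent_ prefix that the child script uses for routing.

```
POST _reindex
{
  "source": {
    "index": "development",
    "type": "asset"
  },
  "dest": {
    "index": "a_mapping_test"
  },
  "script": {
    "source": "ctx._id='parent_' + ctx._id;ctx._type='doc';ctx._source.asset_vuln_join=['name':'asset'];"
  }
}
```

Run this before the child reindex so that every child's parent_-prefixed routing value points at a parent document that already exists under the new ID.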


(Struve) #7

Yep, that did the trick! Now I just need to figure out the best way to flip back to the original IDs once everything is reindexed since update_by_query doesn't work with _routing, whomp whomp.

Thanks for the quick answers!!!


(Abdon Pijpelink) #8

You could always reindex the documents again, into another index, and in doing so flip back to the original parent IDs.
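A rough sketch of that second pass is below. It is untested and rests on several assumptions: the destination index name a_mapping_test_original_ids is hypothetical and would need the same join mapping created beforehand, the child relation is named vulnerability as in the script above, every child document has _routing set, and parent documents store the join field as an object with a name key. Whether this pass avoids the linked issue has not been confirmed here.

```
POST _reindex
{
  "source": {
    "index": "a_mapping_test"
  },
  "dest": {
    "index": "a_mapping_test_original_ids"
  },
  "script": {
    "source": "def p='parent_';def j=ctx._source.asset_vuln_join;if (j.name=='vulnerability') {ctx._routing=ctx._routing.substring(p.length());j.parent=j.parent.substring(p.length());} else {ctx._id=ctx._id.substring(p.length());}"
  }
}
```

The script strips the parent_ prefix from the routing value and the join parent on child documents, and from the document ID on parent documents, so both sides end up back on the original IDs.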

