Object field starting or ending with a [.] makes object resolution ambiguous

I'm trying to deal with an index that the Upgrade Assistant has said needs to be reindexed before an upgrade to Elasticsearch 7. The index was created with 5.6.14 and the cluster currently runs 6.8.9. Because one of the fields sometimes has a value that conflicts with the boolean mapping specified for it in the index template, the index can't be reindexed using the reindex API. Instead I'm having Logstash read the index, modify the problem field where necessary, and output everything to a new index. That works fine, but I've now hit a problem with another field:
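For context, a minimal sketch of that kind of copy pipeline (hosts and the source index name are placeholders, the destination index name is taken from the log below, and the filter that rewrites the conflicting boolean field is only indicated by a comment):

input {
  elasticsearch {
    hosts   => ["localhost:9200"]       # placeholder host
    index   => "logstash-2019.05"       # placeholder: the old index
    docinfo => true                     # copies _index, _type, _id into [@metadata]
  }
}

filter {
  # ... mutate/ruby filter that fixes the field conflicting with the boolean mapping ...
}

output {
  elasticsearch {
    hosts         => ["localhost:9200"] # placeholder host
    index         => "reindex-v6-logstash-2019.05"
    document_type => "_doc"
    document_id   => "%{[@metadata][_id]}"  # keep the original document IDs
  }
}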

[2020-08-27T14:08:17,001][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"AWp5yExBr4bv6yv4hpHU", :_index=>"audit-reindex-v6-logstash-2019.05", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x452f7372>], :response=>{"index"=>{"_index"=>"reindex-v6-logstash-2019.05", "_type"=>"_doc", "_id"=>"AWp5yExBr4bv6yv4hpHU", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"object field starting or ending with a [.] makes object resolution ambiguous: [.requestedDate]"}}}}}

I've encountered that error before, but I'm not sure I understand it in this instance. Evidently 5.6.14 accepted the data. The document in question contains:

"_source": {
    "event": {
      "year": 2019,
      "data": {
        "sessionData.sessions[0].requestedDate": "Mon 23 Sep 2019",
        "sessionData.sessions[0].bookingCapacity": "70",
        "sessionData.sessions[0].dayPart": "Afternoon",
        "sessionData.sessions[0].preferredLocation": "CentralCampus"

Is the problem that Elasticsearch interprets sessionData.sessions[0].requestedDate as sessions being an array, whose first element then can't be an object with a key name of .requestedDate, because field names can't start with a dot?

Suggestions on how to make Logstash turn those strings into something Elasticsearch will accept are welcome!

I think that is correct, and I think this is the code. Early on, Elasticsearch allowed periods in field names, then disallowed them for a while, then started allowing them again but added constraints, and I think you are hitting one of those constraints.

I think you would have to adjust the sessionData.sessions array to contain hashes.
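A minimal sketch of one way to do that in the Logstash pipeline, assuming the flat keys all sit under [event][data] and all follow the sessionData.sessions[N].field pattern (adjust the paths and pattern to match the real documents):

filter {
  ruby {
    # Rebuild keys like "sessionData.sessions[0].requestedDate" into a real
    # sessionData.sessions array of hashes, so the dotted names no longer have
    # to be resolved as object paths on the Elasticsearch side.
    code => '
      data = event.get("[event][data]")
      if data.is_a?(Hash)
        sessions  = []
        remaining = {}
        data.each do |k, v|
          if (m = k.match(/\AsessionData\.sessions\[(\d+)\]\.(.+)\z/))
            idx = m[1].to_i
            sessions[idx] ||= {}
            sessions[idx][m[2]] = v
          else
            remaining[k] = v
          end
        end
        unless sessions.empty?
          remaining["sessionData"] = { "sessions" => sessions.compact }
          event.set("[event][data]", remaining)
        end
      end
    '
  }
}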

I've realised that isn't the problem. Using the Elasticsearch reindex API, I've successfully reindexed lots of indices that the Upgrade Assistant said need reindexing, and I noticed that some of the resulting indices contain fields called sessionData.sessions[0].requestedDate and similar. So Elasticsearch doesn't have a problem with such field names, but Logstash (6.8.9) does.

The attempt to reindex the problem index mentioned in the original post using Logstash had successfully copied all but 26 documents to the new index; the 26 that didn't copy were the ones containing fields with names like sessionData.sessions[0].requestedDate. I extracted the _id of those documents from the Logstash log and then used the Elasticsearch reindex API to copy just those 26 documents from the old index to the new one. It was quite a hassle having to combine the reindex API and Logstash to get everything into a new index, but it's done now!
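A sketch of that kind of per-document copy, i.e. a reindex call restricted to the failed IDs with an ids query (index names are placeholders; only the one _id visible in the log entry above is shown):

POST _reindex
{
  "source": {
    "index": "logstash-2019.05",
    "query": {
      "ids": {
        "values": ["AWp5yExBr4bv6yv4hpHU"]
      }
    }
  },
  "dest": {
    "index": "reindex-v6-logstash-2019.05"
  }
}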
