Binary conversion error using Logstash to Elasticsearch

I am trying to use the Logstash JDBC input to write data from a SQL Server database into Elasticsearch. One column in each of the two tables is a VARBINARY(MAX), and these are failing to convert, with errors like this:

[2017-01-12T11:51:30,022][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>"40993", :_index=>"audit-201701", :_type=>"SystemAuditTable", :_routing=>nil}, 2017-01-12T11:51:28.538Z %{host} %{message}], :response=>{"index"=>{"_index"=>"audit-201701", "_type"=>"SystemAuditTable", "_id"=>"40993", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [encryptedpatient]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character '\' (code 0x5c) in base64 content\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@3eeec71a; line: 1, column: 109]"}}}}}
[2017-01-12T11:51:30,022][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>"41362", :_index=>"audit-201701", :_type=>"SystemAuditTable", :_routing=>nil}, 2017-01-12T11:51:29.054Z %{host} %{message}], :response=>{"index"=>{"_index"=>"audit-201701", "_type"=>"SystemAuditTable", "_id"=>"41362", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [encryptedpatient]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character '\' (code 0x5c) in base64 content\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@79b87c; line: 1, column: 109]"}}}}}

The SQL query is:

SELECT TOP (50000)
[Id]
,[RevisionStamp]
,[Type] as SystemAuditType
,[Action]
,[IpAddress]
,[SectionId]
,[UserId]
,[PatientId]
,[TreatmentId]
,[Patient]
,CAST([EncryptedPatient] AS VARBINARY(MAX)) AS EncryptedPatient
FROM NewINRstarAudit.[dbo].[SystemAuditTable]

And the template for the index:

"SystemAuditTable":{  
  "_all":{  
    "enabled":false
  },
  "properties":{  
    "@timestamp":{  
      "type":"date"
    },
    "@version":{  
      "type":"text"
    },
    "type":{  
      "type":"text",
      "index":true
    },
    "id":{  
      "type":"long",
      "index":true
    },
    "revisionstamp":{  
      "type":"date",
      "index":true
    },
    "systemaudittype":{  
      "type":"text",
      "index":true
    },
    "action":{  
      "type":"text",
      "index":true
    },
    "ipaddress":{  
      "type":"text",
      "index":false
    },
    "sectionid":{  
      "type":"text",
      "index":true
    },
    "userid":{  
      "type":"text",
      "index":true
    },
    "patientid":{  
      "type":"text",
      "index":true
    },
    "treatmentid":{  
      "type":"text",
      "index":true
    },
    "encryptedpatient":{  
      "type":"binary",
      "store":true,
      "index":"false"
    }
  }
}


Thanks

Mark

JSON does not allow raw binary data in fields, so if your field contains binary data I am not surprised it is causing an error. The usual way around this, if the value has to be stored, is to transform it into a string using base64 encoding; this is the method used by, for example, the old mapper attachments plugin.
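
A minimal sketch of that approach as a Logstash ruby filter, assuming the Logstash 5.x event API and that the jdbc input delivers the VARBINARY column as a raw byte string (depending on the JDBC driver you may need to convert the value first). The field name encryptedpatient matches your mapping above:

filter {
  ruby {
    # Load Ruby's standard base64 library once at pipeline startup
    init => "require 'base64'"
    code => "
      raw = event.get('encryptedpatient')
      # strict_encode64 emits no line feeds, which matches the
      # MIME-NO-LINEFEEDS decoder named in the error above
      event.set('encryptedpatient', Base64.strict_encode64(raw.to_s)) unless raw.nil?
    "
  }
}

Alternatively, you could do the encoding on the SQL Server side (for example via the xs:base64Binary XQuery function on an XML value) so that Logstash only ever sees a plain string.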
