Binary conversion error using Logstash to Elasticsearch


(Mark Austin) #1

I am trying to use the Logstash JDBC input to write data from a SQL Server database into Elasticsearch. One of the columns in each of the two tables is a VARBINARY(MAX), and these rows are failing with a conversion error:

[2017-01-12T11:51:30,022][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>"40993", :_index=>"audit-201701", :_type=>"SystemAuditTable", :_routing=>nil}, 2017-01-12T11:51:28.538Z %{host} %{message}], :response=>{"index"=>{"_index"=>"audit-201701", "_type"=>"SystemAuditTable", "_id"=>"40993", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [encryptedpatient]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character '\' (code 0x5c) in base64 content\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@3eeec71a; line: 1, column: 109]"}}}}}
[2017-01-12T11:51:30,022][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>"41362", :_index=>"audit-201701", :_type=>"SystemAuditTable", :_routing=>nil}, 2017-01-12T11:51:29.054Z %{host} %{message}], :response=>{"index"=>{"_index"=>"audit-201701", "_type"=>"SystemAuditTable", "_id"=>"41362", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [encryptedpatient]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character '\' (code 0x5c) in base64 content\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@79b87c; line: 1, column: 109]"}}}}}

The SQL query is:

SELECT TOP (50000)
[Id]
,[RevisionStamp]
,[Type] as SystemAuditType
,[Action]
,[IpAddress]
,[SectionId]
,[UserId]
,[PatientId]
,[TreatmentId]
,[Patient]
,CAST([EncryptedPatient] AS VARBINARY(MAX)) AS EncryptedPatient
FROM NewINRstarAudit.[dbo].[SystemAuditTable]

And the template for the index:

"SystemAuditTable":{  
  "_all":{  
    "enabled":false
  },
  "properties":{  
    "@timestamp":{  
      "type":"date"
    },
    "@version":{  
      "type":"text"
    },
    "type":{  
      "type":"text",
      "index":true
    },
    "id":{  
      "type":"long",
      "index":true
    },
    "revisionstamp":{  
      "type":"date",
      "index":true
    },
    "systemaudittype":{  
      "type":"text",
      "index":true
    },
    "action":{  
      "type":"text",
      "index":true
    },
    "ipaddress":{  
      "type":"text",
      "index":false
    },
    "sectionid":{  
      "type":"text",
      "index":true
    },
    "userid":{  
      "type":"text",
      "index":true
    },
    "patientid":{  
      "type":"text",
      "index":true
    },
    "treatmentid":{  
      "type":"text",
      "index":true
    },
    "encryptedpatient":{  
      "type":"binary",
      "store":true,
      "index":false
    }
  }
}

Thanks

Mark


(Christian Dahlqvist) #2

JSON does not allow raw binary data in fields, so if your field contains binary data I am not surprised it is causing an error. The usual way around this, if the data has to be stored, is to transform it into a string using base64 encoding. This is, for example, the approach taken by the old mapper attachments plugin.
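For instance, one place this encoding could happen is in the Logstash pipeline itself, using a `ruby` filter before the event reaches the elasticsearch output. A minimal sketch of the encoding step (the field name `encryptedpatient` is taken from the query above; the sample bytes are hypothetical):

```ruby
require "base64"

# Hypothetical raw bytes as they might arrive from the jdbc input
raw = "\x01\x02\xFF\xFE".b

# strict_encode64 emits a single line with no linefeeds, which is what
# Elasticsearch's "binary" field type expects (MIME-NO-LINEFEEDS)
encoded = Base64.strict_encode64(raw)

# In a Logstash pipeline, the same call would sit inside a ruby filter:
#   filter {
#     ruby {
#       code => "event.set('encryptedpatient', Base64.strict_encode64(event.get('encryptedpatient')))"
#     }
#   }
```

Reading the value back out is the reverse call, `Base64.strict_decode64(encoded)`. Alternatively, the conversion could be done on the SQL Server side, so the JDBC input already receives a base64 string instead of raw bytes.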


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.