Using avro codec to convert from JSON to avro

(Guido) #1

Hey all,

I was trying to convert files from JSON to Avro, and I was able to do that with the following configuration.

json sample:


test.avsc file:

"namespace": "sample",
"type": "record",
"name": "event",
"fields": [
{"name": "id", "type": "int"},
{"name": "first_name", "type": "string"},
{"name": "last_name", "type": "string"},
{"name": "email", "type": "string"},
{"name": "gender", "type": "string"},
{"name": "ip_address", "type": "string"}

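As a quick sanity check, the schema above can be parsed with Python's standard json module and a record checked against its fields. This is only a minimal sketch; the sample record below is made up for illustration, since the original JSON sample is not shown.

```python
import json

# The test.avsc schema from above, inlined so the snippet is self-contained.
SCHEMA = json.loads("""
{
  "namespace": "sample",
  "type": "record",
  "name": "event",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "first_name", "type": "string"},
    {"name": "last_name", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "gender", "type": "string"},
    {"name": "ip_address", "type": "string"}
  ]
}
""")

# Hypothetical sample record, shaped like the JSON events the config would read.
record = {
    "id": 1,
    "first_name": "Jane",
    "last_name": "Doe",
    "email": "jane.doe@example.com",
    "gender": "Female",
    "ip_address": "192.168.0.1",
}

def matches_schema(rec, schema):
    """Check the record has exactly the schema's fields with the right primitive types."""
    py_types = {"int": int, "string": str}
    fields = {f["name"]: f["type"] for f in schema["fields"]}
    if set(rec) != set(fields):
        return False
    return all(isinstance(rec[name], py_types[typ]) for name, typ in fields.items())

print(matches_schema(record, SCHEMA))  # True
```

This only checks field names and primitive types, but it catches the common case of an input event that doesn't line up with the schema.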
Logstash configuration:

input {
  file {
    path => "/somewhere/test.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => "json"
  }
}

output {
  file {
    path => "/somewhere/test.avro"
    codec => avro {
      schema_uri => "/somewhere/test.avsc"
    }
  }
}
All of this seemed to work fine, since the test.avro file is produced as a result. The problem is that I can't read the schema back out of it: avro-tools fails with an IOException saying it is not a data file.

java -jar ~/Downloads/avro/jar_files/avro-tools-1.8.2.jar getschema out/test.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(
at org.apache.avro.file.DataFileReader.(
at org.apache.avro.tool.Main.main(
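For reference, an Avro data file (an Object Container File per the Avro spec) starts with the four magic bytes Obj\x01, and avro-tools getschema raises exactly this "Not a data file" error when that header is absent. A quick way to check the output is to inspect its first bytes; here is a sketch using only Python's standard library (the byte strings are illustrative stand-ins for real file contents):

```python
import io

AVRO_MAGIC = b"Obj\x01"  # per the Avro spec: 'O', 'b', 'j', then the byte 0x01

def is_avro_container_file(stream):
    """Return True if the stream begins with the Avro Object Container File magic."""
    return stream.read(4) == AVRO_MAGIC

# A stream that starts with the magic vs. one holding arbitrary JSON-ish bytes.
print(is_avro_container_file(io.BytesIO(b"Obj\x01rest-of-file")))  # True
print(is_avro_container_file(io.BytesIO(b'{"id": 1}')))            # False
```

If test.avro fails this check, the output plugin did not write a container file with an embedded schema, which would explain why getschema cannot find one.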

Am I missing something?
By the way, is it possible to control the size of the output files?



(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.