Hi everybody
After searching a lot about this problem, I have come to the conclusion that this is a bug in Logstash. I have asked in this forum and unfortunately didn't get any answer, so I decided to describe step by step how to reproduce the bug. If this is not the right place to report a bug, please tell me where.
In summary: I hit the bug when I try to import data from a PostgreSQL table that has a field of type json into an Elasticsearch index using Logstash. My aim is to end up with a nested field in the Elasticsearch index, using the json field of the Postgres table as the source.
To reproduce the error, please follow these steps:
- Create a table in PostgreSQL:
CREATE TABLE test_jsonfield (
customer integer NOT NULL,
categories_json json
);
- Insert 2 records into the table:
INSERT INTO test_jsonfield VALUES (1, '[{"first_level":297,"second_level":null}]');
INSERT INTO test_jsonfield VALUES (2, '[{"first_level":585,"second_level":[1559,2445]},{"first_level":987,"second_level":[2]}]');
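Just to illustrate the structure these rows should yield: conceptually, the json filter parses the string into an array of hashes, which is the shape the nested mapping further down expects. A minimal plain-Ruby sketch (not Logstash itself, only the parsing step):

```ruby
require 'json'

# JSON value stored in categories_json for customer 2 (see the INSERT above)
raw = '[{"first_level":585,"second_level":[1559,2445]},{"first_level":987,"second_level":[2]}]'

# Parsing yields an array of hashes -- one object per nested document
categories = JSON.parse(raw)

puts categories.length             # 2
puts categories[0]['first_level']  # 585
```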
- Create the Logstash configuration file (test_jsonfield.conf):
input {
jdbc {
jdbc_connection_string => "jdbc:postgresql://mydomain:5432/mydatabase"
jdbc_user => "postgres"
jdbc_password => "mypassword"
jdbc_paging_enabled => true
jdbc_page_size => "50000"
jdbc_validate_connection => true
jdbc_driver_library => "/usr/share/elasticsearch/lib/postgresql-9.4.1208.jar"
jdbc_driver_class => "org.postgresql.Driver"
statement => "SELECT * FROM test_jsonfield"
}
}
filter {
json {
source => "categories_json"
target => "categories"
remove_field => ["categories_json"]
}
}
output {
elasticsearch {
document_id => "%{customer}"
index => "test_jsonfield_nested"
document_type => "test"
}
}
- Create the mapping for the index "test_jsonfield_nested":
POST test_jsonfield_nested/
{
"mappings": {
"test": {
"properties": {
"customer": {
"type": "string"
},
"categories": {
"type": "nested",
"properties": {
"first_level": {
"type": "integer"
},
"second_level": {
"type": "integer"
}
}
}
}
}
}
}
- Check the mapping:
{
"test_jsonfield_nested": {
"mappings": {
"test": {
"properties": {
"categories": {
"type": "nested",
"properties": {
"first_level": {
"type": "integer"
},
"second_level": {
"type": "integer"
}
}
},
"customer": {
"type": "string"
}
}
}
}
}
}
- Run Logstash:
sh logstash -f test_jsonfield.conf
At this point I get the following errors:
Settings: Default pipeline workers: 3
Pipeline main started
Error parsing json {:source=>"categories_json", :raw=>#Java::OrgPostgresqlUtil::PGobject:0x62f1e143, :exception=>java.lang.ClassCastException: org.jruby.java.proxies.ConcreteJavaProxy cannot be cast to org.jruby.RubyIO, :level=>:warn}
Error parsing json {:source=>"categories_json", :raw=>#Java::OrgPostgresqlUtil::PGobject:0x66c5829a, :exception=>java.lang.ClassCastException: org.jruby.java.proxies.ConcreteJavaProxy cannot be cast to org.jruby.RubyIO, :level=>:warn}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
The data is imported, but not as a nested field.
I expected a field named "categories" mapped as nested, according to the json filter configured in "test_jsonfield.conf" and the mapping of the index.
Instead I get a document with a field called "categories_json" (the same name the field has in the Postgres table), something like this:
"_source": {
"customer": 2,
"categories_json": {
"type": "json",
"value": "[{"first_level":585,"second_level":[1559,2445]},{"first_level":987,"second_level":[2]}]"
},
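For what it's worth, the ClassCastException suggests the json filter receives the raw org.postgresql.util.PGobject from the jdbc input instead of a string. A possible workaround (a sketch, which I have not verified here) is to cast the json column to text in the SQL statement so the filter gets a plain string:

```
statement => "SELECT customer, categories_json::TEXT AS categories_json FROM test_jsonfield"
```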
Thanks in advance for your support.
Regards,
Jorge von Rudno