Logstash breaking my data structure

Hello,

I have defined my own index in elasticsearch

PUT donor
{
  "mappings": {
    "properties": {
      {...}
      "donations": { "type": "nested" }
    }
  }
}

Basically I have a donor user in MySQL, and each donor has many donations. I build a JSON array of donation objects in MySQL and want to feed all of this data to Elasticsearch through Logstash. MySQL builds the JSON array perfectly; however, when Logstash runs the query and sends the result to Elasticsearch, something weird happens and I get this:

{:status=>400, :action=>["index", {:_id=>"137347", :_index=>"donors", :routing=>nil}, { "donations"=>"[\"{\\\"donation_id\\\": 13378812, \\\"donation_type_id\\\": 4, \\\"donation_center_id\\\": 1, \\\"donation_status_id\\\": 1, \\\"donation_sponsor_id\\\": 334},{\\\"donation_id\\\": 33432523, \\\"donation_type_id\\\": 4, \\\"donation_center_id\\\": 2, \\\"donation_status_id\\\": 1, \\\"donation_sponsor_id\\\": 300}\"]"}], :response=>{"index"=>{"status"=>400, "error"=>{"type"=>"document_parsing_exception", "reason"=>"[1:36] object mapping for [donations] tried to parse field [donations] as object, but found a concrete value"}}}}

Why is it finding a concrete value? If I manually post my result from mysql through Kibana it works...
Sample result as returned by mysql:

[{"donation_id": 13334300, "donation_type_id": 2, "donation_center_id": 2, "donation_status_id": 3, "donation_sponsor_id": 393}, {"donation_id": 33388011, "donation_type_id": 4, "donation_center_id": 3, "donation_status_id": 3, "donation_sponsor_id": 387}]

The Logstash config is pretty bare bones: no filters, no processing, nothing. Input MySQL, output Elasticsearch. That is all.

The TL;DR is: I want Logstash to not do anything to my entity, just pass it through as is. Why is it so hard? I have been stuck on this issue for hours. Please help.

Hi @joe_El_Yahchouchi

The error likely indicates that Elasticsearch is receiving the donations field as a JSON string instead of an array of objects. This usually happens because Logstash is not correctly deserializing the JSON data received from MySQL.
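To make the difference concrete, here is a sketch (with made-up field values) of what the nested mapping rejects versus what it accepts. In the failing case the whole array arrives as one string value:

"donations": "[{\"donation_id\": 1, \"donation_type_id\": 4}]"

while a nested mapping expects a real array of objects:

"donations": [ { "donation_id": 1, "donation_type_id": 4 } ]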

Try to handle this in the Logstash configuration file, as suggested below:

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/your_database"
    jdbc_user => "your_user"
    jdbc_password => "your_password"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement => "SELECT id, name, donations FROM donors"
  }
}

filter {
  mutate {
    # If necessary, convert the donations field to a string
    convert => { "donations" => "string" }
  }

  json {
    # Parse the donations field as JSON
    source => "donations"
    target => "donations"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "donors"
    document_id => "%{id}"
  }
}

This configuration should ensure that Logstash correctly parses the donations field as a JSON array of objects before sending it to Elasticsearch.
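One more thing worth knowing: if the json filter cannot parse the field, it leaves the event unchanged and adds a _jsonparsefailure tag, which is easy to check for. A minimal sketch (field name taken from the config above):

filter {
  json {
    source => "donations"
    target => "donations"
  }
  if "_jsonparsefailure" in [tags] {
    # The string in [donations] was not valid JSON; flag the event for inspection
    mutate { add_field => { "donations_parse_failed" => "true" } }
  }
}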


Hello, thank you for the reply, but no, this did not work. I am still getting the same error:

{"donations"=>["{\"donation_id\": 1, \"donation_type_id\": 1, \"donation_center_id\": 1, \"donation_status_id\": 2, \"donation_sponsor_id\": 128},{\"donation_id\": 20053712, \"donation_type_id\": 3, \"donation_center_id\": 2, \"donation_status_id\": 1, \"donation_sponsor_id\": 135}"], "@version"=>"1", "prefix"=>"", "@timestamp"=>2024-08-08T13:02:00.512527998Z,}], :response=>{"index"=>{"status"=>400, "error"=>{"type"=>"document_parsing_exception", "reason"=>"[1:15] object mapping for [donations] tried to parse field [null] as object, but found a concrete value"}

There are fewer escape slashes now. Are these a factor at all? It's also wrapping my braces { in quotes; I'm going to see about removing those.

Can you add a file output to your Logstash pipeline and share a sample of the output you are getting from Logstash?

From what you shared it looks like the donations field is not being parsed correctly, which is probably because its value is not recognized as valid JSON.

You can add a file output with this:

file {
    path => "/tmp/sample-output.log"
}

Then get some sample lines and share them.
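For completeness, the file output can sit alongside the existing elasticsearch output in the same output block; adding the rubydebug codec prints the full event structure, which makes it easy to see whether donations arrives as a string or as an array of objects. A sketch (the path is just an example):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "donors"
    document_id => "%{id}"
  }
  file {
    path => "/tmp/sample-output.log"
    codec => rubydebug
  }
}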


Hello, it ended up working. I fiddled a bit with the spacing inside the MySQL query in the config file itself, and that somehow had an impact?
And thank you for teaching me about the file output, very handy.

Thank you all, this can now be closed.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.