Problem indexing a JSON file using Logstash to Elasticsearch?


(noah orj) #1

I have a text file containing a set of emails; each email is represented as a JSON object.

The file is available here http://jsonstudio.com/resources/

One JSON object looks like this:

{ "_id" : { "$oid" : "52af48b7d55148fa0c19ad6b" }, "sender" : "kenneth.lay@enron.com", "recipients" : [ "k..allen@enron.com", "sally.beck@enron.com", "tim.belden@enron.com", "bob.butts@enron.com", "f..calger@enron.com", "rebecca.carter@enron.com", "wes.colwell@enron.com", "bill.cordes@enron.com", "shawn.cumberland@enron.com", "joseph.deffner@enron.com", "j..detmering@enron.com", "rich.dimichele@enron.com", "keith.dodson@enron.com", "jeff.donahue@enron.com", "david.duran@enron.com", "fernley.dyson@enron.com", "connie.estrems@enron.com", "john.gillis@enron.com", "joe.gold@enron.com", "robert.hayes@enron.com", "robert.hermann@enron.com", "gary.hickerson@enron.com", "michael.hutchinson@enron.com", "charlene.jackson@enron.com", "j.kaminski@enron.com", "joe.kishkill@enron.com", "richard.lewis@enron.com", "a..lindholm@enron.com", "phil.lowry@enron.com", "george.mcclellan@enron.com", "tom.mckeever@enron.com", "rob.milnthorp@enron.com", "kristina.mordaunt@enron.com", "mark.muller@enron.com", "julia.murray@enron.com", "greg.piper@enron.com", "james.prentice@enron.com", "brian.redmond@enron.com", "paula.rieker@enron.com", "richard.shapiro@enron.com", "vicki.sharp@enron.com", "rex.shelby@enron.com", "colleen.sullivan@enron.com", "mitch.taylor@enron.com", "elizabeth.tilney@enron.com", "adam.umanoff@enron.com", "rob.walls@enron.com", "george.wasaff@enron.com" ], "cc" : [], "text" : "\tAs announced earlier, we will be bringing all Managing Directors together, on a quarterly basis.  Please hold open the first Monday of every quarter (from 8:30 a.m. to 12:00 p.m.) for this purpose.  However, our first meeting will be on Tuesday, October 2nd.  
I look forward to seeing you there.", "mid" : "7330421.1075852816313.JavaMail.evans@thyme", "fpath" : "enron_mail_20110402/maildir/lay-k/sent_items/9.", "bcc" : [], "to" : [ "k..allen@enron.com", "sally.beck@enron.com", "tim.belden@enron.com", "bob.butts@enron.com", "f..calger@enron.com", "rebecca.carter@enron.com", "wes.colwell@enron.com", "bill.cordes@enron.com", "shawn.cumberland@enron.com", "joseph.deffner@enron.com", "j..detmering@enron.com", "rich.dimichele@enron.com", "keith.dodson@enron.com", "jeff.donahue@enron.com", "david.duran@enron.com", "fernley.dyson@enron.com", "connie.estrems@enron.com", "john.gillis@enron.com", "joe.gold@enron.com", "robert.hayes@enron.com", "robert.hermann@enron.com", "gary.hickerson@enron.com", "michael.hutchinson@enron.com", "charlene.jackson@enron.com", "j.kaminski@enron.com", "joe.kishkill@enron.com", "richard.lewis@enron.com", "a..lindholm@enron.com", "phil.lowry@enron.com", "george.mcclellan@enron.com", "tom.mckeever@enron.com", "rob.milnthorp@enron.com", "kristina.mordaunt@enron.com", "mark.muller@enron.com", "julia.murray@enron.com", "greg.piper@enron.com", "james.prentice@enron.com", "brian.redmond@enron.com", "paula.rieker@enron.com", "richard.shapiro@enron.com", "vicki.sharp@enron.com", "rex.shelby@enron.com", "colleen.sullivan@enron.com", "mitch.taylor@enron.com", "elizabeth.tilney@enron.com", "adam.umanoff@enron.com", "rob.walls@enron.com", "george.wasaff@enron.com" ], "replyto" : null, "ctype" : "text/plain; charset=us-ascii", "fname" : "9.", "date" : "2001-08-24 06:08:29-07:00", "folder" : "sent_items", "subject" : "Executive Committee" }

My Logstash conf file looks like this:

input {
    file {
        path => "/home/eddin/backupLog/*.log"
        start_position => "beginning"
    }
}
filter {
    json {
        source => "message"
    }
}
output {
    elasticsearch {
        hosts => ["131.159.30.3:9200"]
        index => "emails_samples"
        document_type => "emailsLog"
    }
}

The Logstash log file shows no problem with the conf file: Logstash picks up the right file and reads the messages in it (I can see them when I use the rubydebug output), and it connects to Elasticsearch without problems.

When I try to retrieve the docs under that index in Elasticsearch, I get only one document. The file contains hundreds of messages and Logstash parses them all (according to its log file), yet only one message gets indexed by Elasticsearch. What am I doing wrong?
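
One check that might help narrow this down, offered as a sketch rather than a diagnosis: as far as I understand, the Logstash file input emits one event per line by default, and the sample object above, as pasted, breaks across two lines inside its "text" value. If the file really contains such line breaks, those lines would not be valid JSON on their own. The Python snippet below counts how many lines parse as a standalone JSON object; the helper name `classify_lines` is hypothetical and not part of Logstash.

```python
import json


def classify_lines(lines):
    """Return (ok, bad): how many non-empty lines do / do not parse
    as a standalone JSON object."""
    ok = bad = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are ignored in this sketch
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            bad += 1  # e.g. one half of an object split across two lines
            continue
        if isinstance(parsed, dict):
            ok += 1
        else:
            bad += 1  # valid JSON, but not an object
    return ok, bad


# Usage (the path is an assumption; substitute one of the files
# matched by the "path" setting in the conf file):
#
#     with open("/home/eddin/backupLog/emails.log", encoding="utf-8") as f:
#         print(classify_lines(f))
```

If the second number is large, the line-oriented reading is at least a candidate explanation worth ruling out; if it is zero, the cause presumably lies elsewhere.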

