Logstash/Elasticsearch beginner: Can't get File plugin to work. What am I doing wrong?


(Marius Mathisen) #1

I am trying to run a JSON file through Logstash to index it in Elasticsearch, but I can't seem to get it indexed. I don't get any error messages. It just says: "Logstash startup completed".

I have a conf file called "logstash-simple.conf" and it has this content:

input {
  file {
    path => "/Server.json"
  }
}
output {
  elasticsearch {
    host => "localhost"
    index => "Server"
  }
}

And this is what I get in terminal:

Mariuss-MacBook-Pro:logstash-1.5.2 mariusmathisen$ bin/logstash -f logstash-simple.conf
jul 21, 2015 9:43:34 PM org.elasticsearch.node.internal.InternalNode <init>
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] version[1.5.1], pid[9045], build[5e38401/2015-04-09T13:41:35Z]
jul 21, 2015 9:43:34 PM org.elasticsearch.node.internal.InternalNode <init>
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] initializing ...
jul 21, 2015 9:43:34 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] loaded [], sites []
jul 21, 2015 9:43:37 PM org.elasticsearch.node.internal.InternalNode <init> 
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] initialized
jul 21, 2015 9:43:37 PM org.elasticsearch.node.internal.InternalNode start
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] starting ...
jul 21, 2015 9:43:37 PM org.elasticsearch.transport.TransportService doStart
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/10.0.1.6:9301]}
jul 21, 2015 9:43:37 PM org.elasticsearch.discovery.DiscoveryService doStart
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] elasticsearch/OWDC2YR-S066maKkSwbavQ
jul 21, 2015 9:43:40 PM org.elasticsearch.cluster.service.InternalClusterService$UpdateTask run
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] detected_master [Karla Sofen][21jpaJ5_R3yRIEg8-jdcsw][Mariuss-MacBook-Pro.local][inet[/10.0.1.6:9300]], added {[Karla Sofen][21jpaJ5_R3yRIEg8-jdcsw][Mariuss-MacBook-Pro.local][inet[/10.0.1.6:9300]],}, reason: zen-disco-receive(from master [[Karla Sofen][21jpaJ5_R3yRIEg8-jdcsw][Mariuss-MacBook-Pro.local][inet[/10.0.1.6:9300]]])
jul 21, 2015 9:43:40 PM org.elasticsearch.node.internal.InternalNode start
INFO: [logstash-Mariuss-MacBook-Pro.local-9045-13456] started
Logstash startup completed

Here is an example of the content in the JSON file (machineID and name have been changed for security reasons):

{
  "machineID": 111111111,
  "isVirtualMachine": true,
  "name": "WHATEVER12",
  "lastActivity": "2015-07-21T13:33:33",
  "domain": "NO",
  "deviceOS": "Microsoft Windows Server 2012 Standard",
  "lastBoot": "2015-07-16T03:31:07",
  "patchePlan": "Auto"
}, {
  "machineID": 22222222,
  "isVirtualMachine": true,
  "name": "WHATEVER50",
  "lastActivity": "2015-07-21T13:22:05",
  "domain": "NO",
  "deviceOS": "Microsoft Windows Server 2008 R2 Standard",
  "lastBoot": "2015-07-16T22:00:36",
  "patchePlan": "Auto"
}, {
  "machineID": 333333333,
  "isVirtualMachine": false,
  "name": "WHATEVER02",
  "lastActivity": "2015-07-21T13:19:43",
  "domain": "NO",
  "deviceOS": "Microsoft Windows Server 2012 R2 Standard",
  "lastBoot": "2015-07-16T03:22:56",
  "patchePlan": "Auto"
}

When I try to check in Kibana, Marvel or Head there is nothing indexed at all. Also, when I just try to check with a file output nothing happens.

So what am I doing wrong here? I have tried to read up on this, but I can't see where there is an error.


(Magnus Bäck) #2

By default Logstash tails files and won't read them from the beginning. Try appending something to the file and see if it's picked up.

Secondly, if you want JSON files to be parsed and processed in a sane way you'll want to set codec => json or codec => json_lines for the input. However, I suspect Logstash won't be able to parse this either way since your file is a comma-separated list of JSON objects. You'll be fine with

{ ... }

or

{
  ...
}
{
  ...
}

but I doubt

{
  ...
}, {
  ...
}

is acceptable.
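One way to get from the comma-separated layout to the one-object-per-line layout is a small preprocessing script run before Logstash ever sees the file. The following is only a sketch in Python, which is not something used elsewhere in this thread; the function name to_json_lines and the shortened sample records are made up for illustration, and it assumes the whole file fits in memory.

```python
import json


def to_json_lines(text):
    """Turn '{...}, {...}, {...}' into one compact JSON object per line."""
    # Wrapping the comma-separated objects in brackets makes the text a
    # valid JSON array, which the standard parser can then read in one go.
    records = json.loads("[" + text + "]")
    # json.dumps without an indent argument emits each object on a single line.
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Shortened, hypothetical sample in the same shape as the file in post #1.
sample = """{
 "machineID": 111111111,
 "name": "WHATEVER12"
}, {
 "machineID": 22222222,
 "name": "WHATEVER50"
}"""

print(to_json_lines(sample), end="")
```

Writing the result to a new file has a side benefit: a newly created file has no recorded read position yet.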


(Marius Mathisen) #3

I changed the conf file to:

input {
  file {
    codec => "json"
    start_position => "beginning"
    path => "/Servers.json"
  }
}
output {
  elasticsearch {
    host => "localhost"
    index => "Server"
  }
}

And I removed the commas between the objects in the JSON file, but still nothing. What am I missing? Is it possible to have some debugging info shown?


(Magnus Bäck) #4

> And I removed the commas between the objects in the JSON file, but still nothing. What am I missing?

start_position => "beginning" only makes a difference for new files. Your Servers.json isn't new. To have it processed from the top, either delete its sincedb file entry while Logstash is stopped or recreate the file:

cp Servers.json Servers.json.new && mv Servers.json.new Servers.json

> Is it possible to have some debugging info shown?

Starting Logstash with --verbose or --debug results in more logs.

I also suggest that you take Elasticsearch out of the picture for now and just use a stdout output (preferably with codec => rubydebug). Once you've confirmed that things are working you can switch back.


(Marius Mathisen) #5

The file isn't new? What do you mean? Do I have to use "beginning" at all?


(Magnus Bäck) #6

Since Logstash has monitored the file before, it already has a current position recorded for it (namely, the end of the file). Changing the configuration to start_position => "beginning" at this point won't make a difference; that setting only matters for "unseen" files that don't have their position recorded in a sincedb file.
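The rule described here can be written down as a toy model. This Python sketch is not Logstash's actual code; the function name initial_offset, the dictionary standing in for the sincedb file, and the (device, inode) key values are all assumptions made for illustration.

```python
def initial_offset(sincedb, file_key, file_size, start_position="end"):
    """Toy model of where reading starts for a single file."""
    if file_key in sincedb:
        # Known file: resume from the recorded position.
        # start_position is not consulted at all in this branch.
        return sincedb[file_key]
    # Unseen file: this is the only case where start_position matters.
    return 0 if start_position == "beginning" else file_size


servers = (1, 4242)   # hypothetical (device, inode) pair for Servers.json
sincedb = {}          # stands in for the sincedb file

# First run with the default setting: tailing starts at the end of the file.
first = initial_offset(sincedb, servers, 900)
sincedb[servers] = first

# Second run after adding start_position => "beginning": nothing changes,
# because a position is already recorded for this file.
second = initial_offset(sincedb, servers, 900, "beginning")

# A file with no recorded position is read from the top.
fresh = initial_offset({}, (1, 4243), 900, "beginning")

print(first, second, fresh)  # prints "900 900 0"
```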


(Marius Mathisen) #7

But should it be so complicated to feed Logstash with a JSON file that pushes a parsed result to ElasticSearch? Could you give a simple example on how you would do the same? Just a simple JSON file with multiple objects that you want to output to ElasticSearch?


(Magnus Bäck) #8

The file input is meant to be used for tailing log files that change over time. This makes it less than ideal for one-shot processing of a file that doesn't change, for reasons that you've discovered.

If you recreate the file like I showed you yesterday, Logstash should pick it up from the beginning (provided that you also set start_position => "beginning").

Another option is to use the stdin input and feed the file via stdin. If the processing is interrupted Logstash won't be able to pick up where it left off, but apart from that caveat it's much better suited for one-shot processing.


(Marius Mathisen) #9

If it's not too much to ask, could you create a complete example of how you would set up the conf file to process a JSON file?


(Magnus Bäck) #10

input {
  file {
    path => ["filename.json"]
    start_position => "beginning"
    codec => "json"
  }
}

The json_lines codec might be a better fit, I'm not sure. I've never tried to parse files with JSON messages spread over multiple lines.

And again, the example above assumes that the file in question is unknown to Logstash.


(Marius Mathisen) #11

Why do you use brackets? [ ]

So if I just rename the file, Elastic will consider it a new file?

And how about output to Elasticsearch?


(Magnus Bäck) #12

> Why do you use brackets? [ ]

Because the path parameter is an array that can contain multiple filename patterns. I think plain strings are supported too but I prefer being consistent and not relying on undocumented behavior.

> So if I just rename the file, Elastic will consider it a new file?

No, a file retains its inode number when renamed (within the same file system). Logstash's filewatch library uses the device and inode numbers to track files and how much of them has been read. To make a file appear new to Logstash, make a copy of it.
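The difference between renaming and copying can be checked directly on the file system. The following Python sketch only inspects what the operating system reports and does not touch Logstash; the helper name file_key and the file names in the temporary directory are made up for illustration.

```python
import os
import shutil
import tempfile


def file_key(path):
    """Return the (device, inode) pair the operating system reports for path."""
    info = os.stat(path)
    return (info.st_dev, info.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "Servers.json")
    with open(original, "w") as handle:
        handle.write('{"name": "WHATEVER12"}\n')
    key_before = file_key(original)

    # Renaming within the same file system keeps the same inode.
    renamed = os.path.join(tmp, "Servers-renamed.json")
    os.rename(original, renamed)
    key_after_rename = file_key(renamed)

    # Copying creates a second file, which gets an inode of its own.
    copied = os.path.join(tmp, "Servers.json.new")
    shutil.copy(renamed, copied)
    key_after_copy = file_key(copied)

print("rename keeps the identity:", key_before == key_after_rename)
print("copy gets a new identity:", key_before != key_after_copy)
```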

> And how about output to Elasticsearch?

The configuration you posted earlier looks okay (depending on how you've configured ES). Once you've verified (with a stdout output) that you're reading messages correctly you can reenable the elasticsearch output.

