Index JSON file into Elasticsearch


(Alex) #1

Hello,

I'm using a basic Logstash config to index a JSON file into Elasticsearch:

    input {
      file {
        path => "C:\Users\imadd\OneDrive\Bureau\ebusiness.json"
        sincedb_path => "null"
        type => "json"
        codec => "json"
      }
    }

    filter {
      json {
        source => "[message]"
        remove_field => ["[message]"]
      }
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
      }
      stdout {
        codec => rubydebug
      }
    }

Nothing happens when I run the Logstash config:

Sending Logstash logs to C:/dev/tools/logstash-6.5.1/logstash-6.5.1/logs which is now configured via log4j2.properties
[2018-11-21T14:32:58,822][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-11-21T14:32:58,848][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.5.1"}
[2018-11-21T14:33:02,647][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-11-21T14:33:03,025][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-11-21T14:33:03,034][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-11-21T14:33:03,170][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-11-21T14:33:03,232][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-11-21T14:33:03,236][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-11-21T14:33:03,269][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2018-11-21T14:33:03,291][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-11-21T14:33:03,310][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-11-21T14:33:03,686][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x94c2edd run>"}
[2018-11-21T14:33:03,743][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-11-21T14:33:03,759][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2018-11-21T14:33:04,056][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

JSON file:

{"@timestamp":1542724255466,"@version":1,"message":"a","logger_name":"a","thread_name":"a","level":"WARN","level_value":0,"LH-Correlation-ID":"a","caller_class_name":"a","caller_method_name":"a","caller_file_name":"a","caller_line_number":0,"appender_name":"a","hostname":"aa","docker":{"container_id":"a"},"kubernetes":{"container_name":"a","namespace_name":"a","pod_name":"a","pod_id":"a","host":"a","master_url":"a","namespace_id":"a","labels":{"deployment":"a","deploymentconfig":"a","group":"a","project":"a","provider":"a","version":"a"}},"pipeline_metadata":{"collector":{"ipaddr4":"a","ipaddr6":"","inputname":"a","name":"a","received_at":"a","version":"a"}}}

I cannot figure out the problem because there is no ERROR in the logs, and the index is not found in Kibana. Does anyone have an idea?

Thank you


(Lewis Barclay) #2

Try changing the path to:

    path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"

Also ensure that the user you are running Logstash as has permission to read the file.
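For reference, a minimal sketch of the input block with a forward-slash path (everything else left as in your original config):

    input {
      file {
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        type => "json"
        codec => "json"
      }
    }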


(Alex) #3

Thank you for your response. The user has all the rights...


(Lewis Barclay) #4

Did you change the path?


(Alex) #5

Yes, I tried different paths.


(Lewis Barclay) #6

Does the Logstash output change at all?


(Alex) #7

It doesn't change anything; I see the same behavior.
Can you confirm my Logstash configuration? Is there a mistake I don't see?


(Jessé Peixoto) #10

Hi imaad, sorry about deleting the previous replies. I have run some tests and I believe I have found a solution.

You have to set the start_position parameter to "beginning" and also set the sincedb_path parameter to "/dev/null".

The default behavior of start_position treats files like live streams and thus starts at the end. If you have old data you want to import, set this to "beginning".

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-start_position

Also, instead of

    type => "json"

use

    codec => "json"

In this case you would not need the json filter.

Example:

    input {
      file {
        path => "file.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => "json"
      }
    }

    output {
      stdout { codec => "dots" }
      elasticsearch {
        hosts => ["server:9200"]
        index => "json_index"
        document_type => "_doc"
      }
    }

Finally, your @timestamp field will be renamed to _@timestamp and the event tagged with _timestampparsefailure by the json codec, because @timestamp is a reserved timestamp field on Logstash events and your numeric value cannot be parsed into it.
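If you want to recover that original timestamp, a minimal sketch (assuming the codec has renamed the field to _@timestamp as described above, and that the value is epoch milliseconds) would be a date filter:

    filter {
      date {
        # Parse the renamed epoch-milliseconds field back into @timestamp;
        # remove_field only fires if the parse succeeds
        match => ["_@timestamp", "UNIX_MS"]
        remove_field => ["_@timestamp"]
      }
    }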


(Alex) #11

Hello Jessé,

Does this work for you? I already tried the same thing, but Logstash still does not detect my JSON file. I find this weird because I have already used other input plugins, like the elasticsearch input, and they work. But with the file input plugin, nothing happens.

Thank you


(Christian Dahlqvist) #12

On Windows I believe the sincedb path should be nul.

Also check whether your JSON objects in the log file are each written on a single line. If not, you may need to assemble related lines using a multiline codec.
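For reference, a minimal sketch of such a multiline codec (assuming every JSON object starts with { at the beginning of a line; reply #19 below shows a fuller variant):

    input {
      file {
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
        codec => multiline {
          # Any line that does not start with { is joined to the previous event
          pattern => "^{"
          negate => true
          what => "previous"
        }
      }
    }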


#13

Hello,
You need to change

    path => "C:\Users\imadd\OneDrive\Bureau\ebusiness.json"
    sincedb_path => "null"

to

    path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
    sincedb_path => "nul"


(Alex) #14

@balumurari1 @Christian_Dahlqvist Thank you for your responses. I tried what you suggested and I still see the same behavior.

I just noticed that my Elasticsearch logs this WARN:

[2018-11-22T11:07:52,515][WARN ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [Flt4X4V] caught exception while handling client http traffic, closing connection [id: 0x28d856a3, L:/127.0.0.1:9200 - R:/127.0.0.1:54154]
java.io.IOException: An existing connection was forcibly closed by the remote host
        at sun.nio.ch.SocketDispatcher.read0(Native Method) ~[?:?]
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) ~[?:?]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:?]
        at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:?]
        at io.netty.buffer.PooledHeapByteBuf.setBytes(PooledHeapByteBuf.java:261) ~[netty-buffer-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1128) ~[netty-buffer-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347) ~[netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [netty-common-4.1.30.Final.jar:4.1.30.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

Do you think this can affect my configuration?


#15

Try this:

    input {
      file {
        type => "json"
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
      }
    }

    filter {
      json {
        source => "message"
      }
    }

    output {
      stdout { codec => rubydebug }
    }


(Alex) #16

This is the output; again, same behavior:

Sending Logstash logs to C:/dev/tools/logstash-6.5.1/logstash-6.5.1/logs which is now configured via log4j2.properties
[2018-11-22T11:38:47,015][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-11-22T11:38:47,035][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.5.1"}
[2018-11-22T11:38:50,387][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-11-22T11:38:50,913][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x6b2902bc run>"}
[2018-11-22T11:38:50,975][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-11-22T11:38:50,989][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2018-11-22T11:38:51,342][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

#17

Can you share the input config and the steps you are trying to execute? Also, which versions of Elasticsearch and Logstash are you using?


(Alex) #18

Logstash config:

    input {
      file {
        type => "json"
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
      }
    }

    filter {
      json {
        source => "message"
      }
    }

    output {
      stdout { codec => rubydebug }
    }

Command line:

    logstash -f C:/dev/tools/logstash-6.5.1/logstash-6.5.1/confs/business.conf

Elasticsearch 6.5.1 / Logstash 6.5.1
Is that what you need?


#19

Try this:

    input {
      file {
        codec => multiline {
          pattern => "^{"
          negate => true
          what => "previous"
        }
        path => ["C:/Users/imadd/OneDrive/Bureau/ebusiness.json"]
        start_position => "beginning"
        sincedb_path => "nul"
        exclude => "*.gz"
      }
    }

    filter {
      mutate {
        replace => [ "message", "%{message}}" ]
        gsub => [ "message", "\n", "" ]
      }
      if [message] =~ /^{.*}$/ {
        json { source => "message" }
      }
    }

    output {
      stdout { codec => rubydebug }
    }


(Alex) #20

Same behavior, unfortunately...


(Jessé Peixoto) #21

The file input requires each line to end with a newline character. This has happened to me before: with a single-line JSON file and no trailing newline, nothing happens. Maybe that is your problem.
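One way to rule out the file-reading side entirely is to test the same codec over stdin, where pressing Enter supplies the newline. A sketch (test.conf is just a placeholder name):

    input {
      stdin {
        codec => "json"
      }
    }

    output {
      stdout { codec => rubydebug }
    }

Run logstash -f test.conf, paste the JSON line, and press Enter; if an event prints, the pipeline itself is fine and the problem is on the file side (path, permissions, sincedb, or the missing trailing newline).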