Index JSON file into Elasticsearch

Hello,

I'm using a basic Logstash conf to index a JSON file into Elasticsearch:

    input {
      file {
        path => "C:\Users\imadd\OneDrive\Bureau\ebusiness.json"
        sincedb_path => "null"
        type => "json"
        codec => "json"
      }
    }

    filter {
      json {
        source => "[message]"
        remove_field => ["[message]"]
      }
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
      }
      stdout {
        codec => rubydebug
      }
    }

Nothing happens when I run Logstash with this conf:

Sending Logstash logs to C:/dev/tools/logstash-6.5.1/logstash-6.5.1/logs which is now configured via log4j2.properties
[2018-11-21T14:32:58,822][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-11-21T14:32:58,848][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.5.1"}
[2018-11-21T14:33:02,647][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-11-21T14:33:03,025][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-11-21T14:33:03,034][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-11-21T14:33:03,170][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-11-21T14:33:03,232][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-11-21T14:33:03,236][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-11-21T14:33:03,269][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2018-11-21T14:33:03,291][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-11-21T14:33:03,310][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-11-21T14:33:03,686][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x94c2edd run>"}
[2018-11-21T14:33:03,743][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-11-21T14:33:03,759][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2018-11-21T14:33:04,056][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

JSON file:

    {"@timestamp":1542724255466,"@version":1,"message":"a","logger_name":"a","thread_name":"a","level":"WARN","level_value":0,"LH-Correlation-ID":"a","caller_class_name":"a","caller_method_name":"a","caller_file_name":"a","caller_line_number":0,"appender_name":"a","hostname":"aa","docker":{"container_id":"a"},"kubernetes":{"container_name":"a","namespace_name":"a","pod_name":"a","pod_id":"a","host":"a","master_url":"a","namespace_id":"a","labels":{"deployment":"a","deploymentconfig":"a","group":"a","project":"a","provider":"a","version":"a"}},"pipeline_metadata":{"collector":{"ipaddr4":"a","ipaddr6":"","inputname":"a","name":"a","received_at":"a","version":"a"}}}

I cannot figure out the problem: there is no ERROR in the logs, yet the index is not found in Kibana. Does anyone have an idea?

Thank you

Try changing the path to:

    path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"

And ensure that the user you are running Logstash as has permission to read the file.

Thank you for your response. The user has all the rights...

Did you change the path?

Yes, I tried different paths.

Does the logstash output change at all?

It doesn't change anything; I have the same behavior.
Can you confirm my Logstash configuration? Is there a mistake I don't see?

Hi imaad, sorry about deleting the previous replies. I have run some tests and I believe I have found a solution.

You have to set the start_position parameter to "beginning" and also set the sincedb_path parameter to "/dev/null".

The default behavior of start_position treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning.

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-start_position

Also, instead of

    type => "json"

use

    codec => "json"

In this case you would not need the json filter.

Example:

    input {
      file {
        path => "file.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => "json"
      }
    }

    output {
      stdout { codec => "dots" }
      elasticsearch {
        hosts => ["server:9200"]
        index => "json_index"
        document_type => "_doc"
      }
    }

Finally, your @timestamp field will be renamed to _@timestamp and the event tagged with _timestampparsefailure by the json codec, because Logstash already manages a default @timestamp field on every event.
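
For illustration, a rough sketch of how such an event might then look in the rubydebug output (hypothetical and abbreviated; the exact fields and values depend on your document):

    {
        "_@timestamp" => 1542724255466,
         "@timestamp" => 2018-11-21T14:33:03.686Z,
               "tags" => [ "_timestampparsefailure" ],
            "message" => "a",
        ...
    }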

Hello Jessé,

Does this work for you? I already tried the same thing, but Logstash still does not detect my JSON file. I find this weird, because I have already used other input plugins, like the elasticsearch input, and they work. But with the file input, nothing happens.

Thank you

On Windows I believe the sincedb path should be nul.

Also check whether the JSON objects in your log file are each written on their own line. If not, you may need to assemble related lines using a multiline codec.
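
For example, a minimal sketch of such a codec on the file input, assuming each event begins with { at the start of a line (untested; adjust the pattern to your data):

    codec => multiline {
      pattern => "^{"
      negate => true
      what => "previous"
    }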

Hello,
You need to change

    path => "C:\Users\imadd\OneDrive\Bureau\ebusiness.json"
    sincedb_path => "null"

to

    path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
    sincedb_path => "nul"
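
Putting this together with the earlier suggestions in this thread, the input block would look something like this (a sketch, not tested on your setup):

    input {
      file {
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
        codec => "json"
      }
    }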

@balumurari1 @Christian_Dahlqvist Thank you for your responses. I tried what you suggested and I still have the same behavior.

I just noticed that my Elasticsearch runs with this WARN:

[2018-11-22T11:07:52,515][WARN ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [Flt4X4V] caught exception while handling client http traffic, closing connection [id: 0x28d856a3, L:/127.0.0.1:9200 - R:/127.0.0.1:54154]
java.io.IOException: Une connexion existante a dû être fermée par l'hôte distant
        at sun.nio.ch.SocketDispatcher.read0(Native Method) ~[?:?]
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) ~[?:?]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:?]
        at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:?]
        at io.netty.buffer.PooledHeapByteBuf.setBytes(PooledHeapByteBuf.java:261) ~[netty-buffer-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1128) ~[netty-buffer-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347) ~[netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [netty-common-4.1.30.Final.jar:4.1.30.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

(The French message means "An existing connection was forcibly closed by the remote host".) Do you think this can affect my configuration?

Try this:

    input {
      file {
        type => "json"
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
      }
    }

    filter {
      json {
        source => "message"
      }
    }

    output {
      stdout { codec => rubydebug }
    }

This is the output; again, the same behavior:

Sending Logstash logs to C:/dev/tools/logstash-6.5.1/logstash-6.5.1/logs which is now configured via log4j2.properties
[2018-11-22T11:38:47,015][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-11-22T11:38:47,035][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.5.1"}
[2018-11-22T11:38:50,387][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-11-22T11:38:50,913][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x6b2902bc run>"}
[2018-11-22T11:38:50,975][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-11-22T11:38:50,989][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2018-11-22T11:38:51,342][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Can you give me the input config and the steps you are trying to execute?
Also, which versions of Elasticsearch and Logstash are you using?

Logstash conf:

    input {
      file {
        type => "json"
        path => "C:/Users/imadd/OneDrive/Bureau/ebusiness.json"
        start_position => "beginning"
        sincedb_path => "nul"
      }
    }

    filter {
      json {
        source => "message"
      }
    }

    output {
      stdout { codec => rubydebug }
    }

Command line:

    logstash -f C:/dev/tools/logstash-6.5.1/logstash-6.5.1/confs/business.conf

Elasticsearch 6.5.1 / Logstash 6.5.1
Is that what you need?

Try this:

    input {
      file {
        codec => multiline {
          pattern => '^{'
          negate => true
          what => previous
        }
        path => ["C:/Users/imadd/OneDrive/Bureau/ebusiness.json"]
        start_position => "beginning"
        sincedb_path => "nul"
        exclude => "*.gz"
      }
    }

    filter {
      mutate {
        replace => [ "message", "%{message}}" ]
        gsub => [ 'message', '\n', '' ]
      }
      if [message] =~ /^{.*}$/ {
        json { source => message }
      }
    }

    output {
      stdout { codec => rubydebug }
    }

Same behavior, unfortunately...

The file input requires each line to end with a newline character. This has happened to me before: with a single-line JSON file that lacked a trailing newline character, nothing happened. Maybe this is your problem.
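
If that is the case here, one quick way to add the missing newline on Windows is to append an empty line from a command prompt (a hedged suggestion; the path is the one used throughout this thread):

    echo.>> C:\Users\imadd\OneDrive\Bureau\ebusiness.json

With sincedb_path => "nul" and start_position => "beginning" set as discussed above, restarting Logstash after this should make the file input emit the line.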
