Logstash Elasticsearch output with very big JSON is not outputting


#1

Hi,

I have a very large nested JSON document (22KB) that I am trying to send to Elasticsearch, but nothing gets indexed. Output to a file works as expected, and a small JSON document works as well. Is there a size limit in the json filter?

Please suggest what I should do here. Thanks in advance.


(Magnus Bäck) #2

Have you checked your Logstash logs for errors or other clues?


#3

Hi @magnusbaeck, I don't see any errors in the logs. Any guesses? I am on 5.5.0.


(Magnus Bäck) #4
codec => "json"

Remove this.


#5

Hi @magnusbaeck, I removed the json codec but it's still not outputting to Elasticsearch.

Also, Logstash does not flush the events until I stop it. Here is the log:

[2017-09-25T13:04:25,175][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
^C[2017-09-25T13:04:43,107][WARN ][logstash.runner ] SIGINT received. Shutting down the agent.
[2017-09-25T13:04:43,124][WARN ][logstash.agent ] stopping pipeline {:id=>"main"}
[2017-09-25T13:04:48,114][WARN ][logstash.runner ] Received shutdown signal, but pipeline is still waiting for in-flight events
to be processed. Sending another ^C will force quit Logstash, but this may cause
data loss.
[2017-09-25T13:04:48,134][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>1, "stalling_thread_info"=>{["LogStash::Filters::Mutate", {"remove_field"=>["prefix", "filename", "message"], "id"=>"98f03816aae7e170d420da8b2950e9f999380aa9-6"}]=>[{"thread_id"=>43, "name"=>"[main]>worker10", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/manticore-0.6.1-java/lib/manticore/response.rb:50:in `call'"}]}}
[2017-09-25T13:04:48,136][ERROR][logstash.shutdownwatcher ] The shutdown process appears to be stalled due to busy or blocked plugins. Check the logs for more information.


(Magnus Bäck) #6

Are you getting anything to your file or stdout outputs? Have you verified that your multiline configuration works? What does the input look like?


#7

Thanks @magnusbaeck. My input looks like this (separated by \n):


#8

I figured out one issue: auto_flush_interval was missing. I'm not sure what is wrong with the rest of the config.


#9

How do I specify the multiline pattern when \n is at the end of the line?

I configured it as shown below, but it matches at the start of the line, so my output is missing the last curly brace:

pattern => "^\n"


#10

I am still having issues with a 43KB JSON document that is not being indexed into Elasticsearch, even though it is valid. What should I do to index this big document? Logstash did not complain, and I don't see any errors in the Elasticsearch logs either.


(Magnus Bäck) #11

So your input files contain multiple multiline JSON structures, separated by blank lines? First of all I don't think \n in the pattern makes any sense. If you want to match a blank line use ^$. Secondly I suspect using what => "previous" might be a better idea.
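One way to check the input independently of Logstash is to reproduce the grouping offline. The sketch below is not Logstash code. It is a small Python approximation that assumes documents are separated by blank lines, splits on them, and tries to parse each chunk as JSON. The function names and the sample text are made up for illustration.

```python
import json


def split_on_blank_lines(text):
    """Group lines into chunks, closing the current chunk at every blank line."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip() == "":
            if current:
                chunks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:  # the last chunk has no trailing blank line to close it
        chunks.append("\n".join(current))
    return chunks


def parse_chunks(text):
    """Parse each chunk as JSON; return (documents, errors)."""
    documents, errors = [], []
    for number, chunk in enumerate(split_on_blank_lines(text), start=1):
        try:
            documents.append(json.loads(chunk))
        except json.JSONDecodeError as exc:
            # Keep the chunk number so the broken document is easy to locate.
            errors.append((number, str(exc)))
    return documents, errors


# Hypothetical sample: two multiline JSON documents separated by a blank line.
sample = '{\n  "id": 1\n}\n\n{\n  "id": 2,\n  "tags": ["a"]\n}\n'
docs, errs = parse_chunks(sample)
print(len(docs), "documents parsed,", len(errs), "errors")
```

If a chunk fails to parse here, the multiline grouping is probably not the only problem. Note that the final chunk has no blank line after it, which is the situation auto_flush_interval exists for in the multiline codec.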


(Christian Dahlqvist) #12

Even though the last document may be valid JSON, it cannot be indexed into Elasticsearch because it contains fields with conflicting mappings. You should be seeing an error in Logstash about this.

If you cannot find this error, you can also try indexing a single document using curl, which should return an error about the mapping conflict.
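For reference, the same single-document test can be done without curl. This is a minimal sketch using only the Python standard library. The host, index name, and type name are placeholders, and the /index/type URL shape is the 5.x form (newer versions use /index/_doc).

```python
import json
import urllib.error
import urllib.request


def build_index_request(document, base_url="http://localhost:9200",
                        index="test-index", doc_type="doc"):
    """Build a POST request that indexes one document (5.x URL: /index/type)."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/{doc_type}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_document(document, **kwargs):
    """Send the request; return (HTTP status, response body as text)."""
    request = build_index_request(document, **kwargs)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # Elasticsearch reports mapping problems in the body of a 4xx response.
        return exc.code, exc.read().decode("utf-8")
```

Calling index_document() with the parsed contents of one problem document should return the status code together with the error body, which is where a mapping problem would be described.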

I have not gone through the full document, but I can clearly see that at least the type field towards the end is mapped both as a string and as an object, which is not allowed in Elasticsearch. Every field must have a single mapping throughout the entire index.
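To illustrate what such a conflict looks like, here is a rough Python sketch that walks a JSON document and reports every dotted field path that holds both an object and a plain value. It only approximates the rule: it ignores finer distinctions such as string versus number, and the sample document is invented for the example, not taken from this thread.

```python
import json
from collections import defaultdict


def kind_of(value):
    """Coarse category of a JSON value, as far as this sketch cares."""
    if isinstance(value, dict):
        return "object"
    if value is None:
        return "null"
    return "value"  # strings, numbers, booleans


def collect_kinds(value, path="", kinds=None):
    """Record which kinds of value appear at each dotted field path."""
    if kinds is None:
        kinds = defaultdict(set)
    if isinstance(value, list):
        # Arrays do not add a path level: every element shares the field's path.
        for item in value:
            collect_kinds(item, path, kinds)
    else:
        if path:
            kinds[path].add(kind_of(value))
        if isinstance(value, dict):
            for key, child in value.items():
                child_path = f"{path}.{key}" if path else key
                collect_kinds(child, child_path, kinds)
    return kinds


def find_conflicts(document):
    """Return the paths that hold both an object and a plain value."""
    kinds = collect_kinds(document)
    return sorted(p for p, seen in kinds.items() if {"object", "value"} <= seen)


# Hypothetical document: "type" is a string in one array element, an object in another.
doc = json.loads("""
{
  "items": [
    {"type": "simple"},
    {"type": {"name": "complex"}}
  ],
  "owner": {"type": "user"}
}
""")
print(find_conflicts(doc))
```

In this sample only items.type is reported. owner.type is a different path, so a field called type under another parent does not clash with it in this check.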


#13

Thanks @Christian_Dahlqvist. Because it is nested JSON, are you saying that all properties under the same parent node should be of the same type? Can you please point me to the documentation?

