JSON-formatted log contains a "message" field that confuses Logstash

We have servers which are writing their logs out in JSON format. An example:

{"message":{"description":"System error, contact application support team","exception":"java.util.NoSuchElementException","user_agent":"Jodd HTTP","code":"SYSTEM_ERROR","stacktrace":"[java.util.LinkedList.removeFirst(LinkedList.java:268), java.util.LinkedList.pop(LinkedList.java:799), com.company.i360.core.drools.DroolRule.pop(DroolRule.java:350), com.company.i360.core.drools.DroolsService.getSession(DroolsService.java:212), com.company.cloud.edge.rules.v1.EdgeRulesLogic.applyRules(EdgeRulesLogic.java:36), com.company.cloud.edge.rules.v1.EdgeRulesWebServiceV1.eventRules(EdgeRulesWebServiceV1.java:38), sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source), sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43), java.lang.reflect.Method.invoke(Method.java:606), com.company.i360.core.web.task.WebTask.execute(WebTask.java:288), com.company.i360.core.web.Resolver.executeWebServiceTask(Resolver.java:148), com.company.i360.core.web.Resolver.resolve(Resolver.java:69), com.company.i360.core.web.MainServlet.service(MainServlet.java:144), javax.servlet.http.HttpServlet.service(HttpServlet.java:731), org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303), org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208), org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52), org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241), org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208), org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218), org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122), org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505), org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169), org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103), org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956), org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116), org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442), org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1082), org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:623), org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316), java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145), java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615), org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61), java.lang.Thread.run(Thread.java:745)]","type":"runtime","url":"/cloud-edge/action/edge-rules-v1/event-rules"},"timestamp":"2016-11-04 16:21:33,584", "loggerName":"sys.web", "level":"ERROR", "threadName":"7631082047"

Since the line already has a "message" field defined, it seems like Logstash is having trouble deciphering what it is supposed to do. If I add a JSON filter for the message field, then Logstash has trouble saving the contents, saying the field "description" is unknown:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [message]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [description]"}}}}, :level=>:warn

How do I get Logstash to realize that "message" is not the whole message, so it can evaluate the line properly?

What do the mappings of the ES index look like? Have you disabled dynamic mapping?
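If you haven't already, you can dump the current mappings with something along these lines (assuming Elasticsearch is reachable on localhost:9200; adjust the index pattern to match your indices). The message mapping is the interesting part:

curl -XGET 'http://localhost:9200/logstash-*/_mapping?pretty'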

The mappings are being created dynamically - they're huge because of the inconsistencies of the data being written to them (lots of possible formats).

Any suggestions here? I've upgraded my Logstash boxes to see if that helps with the handling. When I get the following payload:
{ "message": { "description": "System error, contact application support team", "exception": "java.util.NoSuchElementException", "user_agent": "Jodd HTTP", "code": "SYSTEM_ERROR", "stacktrace": "[java.util.LinkedList.removeFirst(LinkedList.java:268), java.util.LinkedList.pop(LinkedList.java:799), com.gilbarco.i360.core.drools.DroolRule.pop(DroolRule.java:350), com.gilbarco.i360.core.drools.DroolsService.getSession(DroolsService.java:212), com.gilbarco.cloud.edge.rules.v1.EdgeRulesLogic.applyRules(EdgeRulesLogic.java:36), com.gilbarco.cloud.edge.rules.v1.EdgeRulesWebServiceV1.eventRules(EdgeRulesWebServiceV1.java:38),

I can't post the error code because this crappy support site won't let me post LOGSTASH's OWN ERROR OUTPUT to your site.
The error message does not help at all. I am stuck and have no idea what to do to troubleshoot this further. Can someone please suggest SOMETHING I can do to get past this? I am going to have to rip out ELK if I can't find a fix; having "most" of my logs does me no good.

I can't post the error code because this crappy support site won't let me post LOGSTASH's OWN ERROR OUTPUT to your site.

How so? If you mark it as preformatted text it should work just fine.

The problem could be that message is mapped as a string but you're trying to send an object. That's why I asked for the current mappings. You don't need to post all of them if they're huge but I'd like to see the message mapping.

I did some reading and (I think) I understand the issue now. If I change the message field's type to "object", will that handle nested JSON as well? Or do I need to define it at each level?

No, setting message to object should be enough provided that you haven't disabled dynamic mapping (it's enabled by default).
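To illustrate (just a sketch of the relevant fragment, with property names taken from your example line, not a complete mapping), message would then end up mapped as an object along these lines, with dynamic mapping filling in the nested fields:

"message" : {
  "type" : "object",
  "dynamic" : true,
  "properties" : {
    "description" : { "type" : "string" },
    "exception" : { "type" : "string" },
    "code" : { "type" : "string" }
  }
}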

Thank you for your help, this is quite frustrating. Here is a (somewhat abbreviated) example line:

{:timestamp=>"2017-01-19T19:17:57.992000+0000", :message=>"Failed action. ", :status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-dev-2017.01.19", :_type=>"log4j-json", :_routing=>nil}, #<LogStash::Event:0x3a2d595f @metadata_accessors=#<LogStash::Util::Accessors:0x21e65740 @store={}, @lut={}>, @cancelled=false, @data={"message"=>{"description"=>"System error, contact application support team", "exception"=>"java.util.NoSuchElementException", "user_agent"=>"Jodd HTTP", "code"=>"SYSTEM_ERROR", "stacktrace"=>"[java.util.LinkedList.removeFirst(LinkedList.java:268), java.util.LinkedList.pop,org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61), java.lang.Thread.run(Thread.java:745)]", "type"=>"runtime", "url"=>"/cloud-edge/action/edge-rules-v1/event-rules"}, "@version"=>"1", "@timestamp"=>"2017-01-19T19:17:50.832Z", "file"=>"/var/log/tomcat7/gilbarco-cloud-edge.json", "host"=>"rm-cloud-edge-ip-10-4-29-176", "type"=>"log4j-json", "host_type"=>"rm-cloud-edge-dev", "timestamp"=>"2017-01-19 19:17:50,832", "loggerName"=>"sys.web", "level"=>"ERROR", "threadName"=>"6688592347"}, "type"]}>>], :response=>{"create"=>{"_index"=>"logstash-dev-2017.01.19", "_type"=>"log4j-json", "_id"=>"AVm4KgJIdJJ5v8lo274g", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [message]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [description]"}}}}, :level=>:warn}
My filter is simple:

if [type] == "log4j-json" {
  json {
    source => "message"
  }

  date {
    locale => en
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601" ]
  }
}

Oh and the mappings:
{ "logstash-dev-2017.01.19" : { "mappings" : { "log4j-json" : { "_all" : { "enabled" : true, "omit_norms" : true }, "dynamic_templates" : [ { "message_field" : { "mapping" : { "fielddata" : { "format" : "disabled" }, "index" : "analyzed", "omit_norms" : true, "type" : "string" }, "match" : "message", "match_mapping_type" : "string" } }, { "string_fields" : { "mapping" : { "fielddata" : { "format" : "disabled" }, "index" : "analyzed", "omit_norms" : true, "type" : "string", "fields" : { "raw" : { "ignore_above" : 256, "index" : "not_analyzed", "type" : "string" } } }, "match" : "*", "match_mapping_type" : "string" } } ], "properties" : { "@timestamp" : { "type" : "date", "format" : "strict_date_optional_time||epoch_millis" }, "@version" : { "type" : "string", "index" : "not_analyzed" }, "file" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "geoip" : { "dynamic" : "true", "properties" : { "ip" : { "type" : "ip" }, "latitude" : { "type" : "float" }, "location" : { "type" : "geo_point" }, "longitude" : { "type" : "float" } } }, "host" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "host_type" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "level" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "loggerName" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "message" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" } }, "threadName" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "timestamp" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } }, "type" : { "type" : "string", "norms" : { "enabled" : false }, "fielddata" : { "format" : "disabled" }, "fields" : { "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 } } } } } } } }

Well, message is mapped as a string and you need to do something about that. I would've expected a different error message, but let's start at that end anyway.
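You can't change the type of a field that's already mapped in an existing index, so one approach is an index template that kicks in for the next daily index, roughly like this (untested sketch; assumes Elasticsearch on localhost:9200, and it should be merged with your existing Logstash template rather than replacing it):

curl -XPUT 'http://localhost:9200/_template/message-as-object' -d '
{
  "template" : "logstash-dev-*",
  "mappings" : {
    "log4j-json" : {
      "properties" : {
        "message" : {
          "type" : "object",
          "dynamic" : true
        }
      }
    }
  }
}'

Keep in mind that any application that logs message as a plain string would then clash with this mapping in the same index, so you may need to keep those apart.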

Ok, I've gotten some time to work on this and the craziness continues. Still struggling with (I think) keeping everything unique. For most of my logs I have the following (where host_type is set by the logstash-forwarder to the name of the Tomcat application the host is running):

if [type] == "log4j-json" {
  json {
    source => "message"
    target => "log_data"
  }

  date {
    locale => en
    match => [ "[log_data][timestamp]", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601" ]
  }

  mutate {
    remove_field => [ "message" ]
    rename => { "log_data" => "%{host_type}" }
  }
}

The incoming files are JSON formatted. Sort of: valid JSON, but ugly inside. For example, for the host_type "rm-notification-ws-dev-dev" (don't ask):

{"message":"action=process_notification_messages , payload={\"id\":\"AVnwGgYUCMuYDvqxa8YD\",\"deliveryType\":\"PUSH-NOTIFICATION\",\"notificationType\":\"EVENT-SUBSCRIPTION\",\"userId\":\"gus@ail.com\",\"userHash\":\"e5f358e424bf321f8ac5ae786d2907be\",\"notificationTimestamp\":\"2017-01-30T15:59:14.441Z\",\"payload\":{\"operation\":\"probe-sample-event\",\"siteDetail\":{\"siteId\":108465,\"address1\":\"125 POWDER FOREST DRIVE\",\"siteName\":\"TLechak Office TLS-450P\"},\"deviceDetail\":{\"deviceType\":\"TLS350\",\"deviceTypeDescription\":\"TLS350\",\"deviceId\":\"1244922\"}},\"notificationCategory\":\"info\"}","timestamp":"2017-01-30 15:59:14,950","loggerName":"sys.consumer","level":"INFO","threadName":"ActiveMQ Session Task-11898"}

This works as expected (hurray).

But other applications, with very similar formatting, fail (host_type "i360-core-dev"):

{"message":{"description":"Package not found","exception":"com.gilbarco.i360.core.error.ApplicationException: TASK_NOT_FOUND: TASK_NOT_FOUND - null","user_agent":"Jodd","code":"TASK_NOT_FOUND","type":"checked","url":"/passport-core/action/scheduled-task-v1/schedule-callback"},"timestamp":"2017-01-30 04:29:00,180","loggerName":"sys.web","level":"WARN","threadName":"2417383241"} 

I cannot figure out what is going on here. It never accepts this line - there is no mapping for the second host_type's messages - it's not taking any of them! Here is what Logstash says:

{:timestamp=>"2017-01-30T16:06:30.309000+0000", :message=>"Failed action. ", :status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-dev-2017.01.30", :_type=>"log4j-json", :_routing=>nil}, #<LogStash::Event:0x6efd0f07 @metadata_accessors=#<LogStash::Util::Accessors:0x4f52bb36 @store={}, @lut={}>, @cancelled=false, @data={"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", "exception"=>"com.gilbarco.i360.core.error.ApplicationException: TASK_NOT_FOUND: TASK_NOT_FOUND - null", "user_agent"=>"Jodd HTTP", "code"=>"TASK_NOT_FOUND", "type"=>"checked", "url"=>"/passport-core/action/scheduled-task-v1/schedule-callback"}, "timestamp"=>"2017-01-30 04:29:00,180", "loggerName"=>"sys.web", "level"=>"WARN", "threadName"=>"2417383241"}}, @metadata={}, @accessors=#<LogStash::Util::Accessors:0x583a16e6 @store={"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", <snip>}, @lut={"i360-core-dev"=>[{"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", "exception"=>"com.gilbarco.i360.core.error.ApplicationException: TASK_NOT_FOUND: TASK_NOT_FOUND - null", "user_agent"=>"Jodd HTTP", "code"=>"TASK_NOT_FOUND", "type"=>"checked", "url"=>"/passport-core/action/scheduled-task-v1/schedule-callback"}, "timestamp"=>"2017-01-30 04:29:00,180", "loggerName"=>"sys.web", "level"=>"WARN", "threadName"=>"2417383241"}}, "i360-core-dev"], "message"=>[{"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", "exception"=>"ore.error.ApplicationException: TASK_NOT_FOUND: TASK_NOT_FOUND - null", "user_agent"=>"Jodd HTTP", "code"=>"TASK_NOT_FOUND", "type"=>"checked", "url"=>"/passport-core/action/scheduled-task-v1/schedule-callback"}, "timestamp"=>"2017-01-30 04:29:00,180", "loggerName"=>"sys.web", "level"=>"WARN", "threadName"=>"2417383241"}}, "message"], **"[type]"=>**[{"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", "loggerName"=>"sys.web", "level"=>"WARN", "threadName"=>"2417383241"}}, "type"], "type"=>[{"@version"=>"1", "@timestamp"=>"2017-01-30T04:29:00.180Z", "file"=>"/var/log/tomcat7/gilbarco-passport-core.json", "host"=>"i360-core-ip-10-4-29-236", "host_type"=>"i360-core-dev", "type"=>"log4j-json", "i360-core-dev"=>{"message"=>{"description"=>"Package not found", "loggerName"=>"sys.web", "level"=>"WARN", "threadName"=>"2417383241"}}, "type"]}>>], :response=>{"create"=>{"_index"=>"logstash-dev-2017.01.30", "_type"=>"log4j-json", "_id"=>"AVnwIKrYH7RsVvf3n9Al", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse 
[i360-core-dev.message]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [description]"}}}}, :level=>:warn}

I can't look at the mappings because Logstash has not gotten any messages accepted to generate them with!

Ok, this is a clear misunderstanding on my part: I did not realize that webserver.color and appserver.color needed to be the same type. Trying now with an index per app.
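For reference, a minimal sketch of the index-per-application idea in the elasticsearch output (the hosts value and index naming are assumptions; Elasticsearch index names must be lowercase, which the host_type values here already are):

output {
  if [type] == "log4j-json" {
    elasticsearch {
      hosts => ["localhost:9200"]                      # assumption: your Elasticsearch endpoint
      index => "logstash-%{host_type}-%{+YYYY.MM.dd}"  # one daily index per application
    }
  }
}

That way each application gets its own mappings, and a string "message" in one application can no longer clash with an object "message" in another.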