Filebeat sends a JSON log but it is stored as a string


(Dj Vidov) #1

Hello,

I have logs in a file, like this:
{"LogLevel":"INFO","StartTime":"15:01:30.625528","ExecutionTime":1,"CallingMethod":"GetHistoryComments"}
{"LogLevel":"INFO","StartTime":"15:01:25.5745528","ExecutionTime":0,"CallingMethod":"InitApp"}

I use Filebeat 1.0 to send these logs to Elasticsearch, but instead of being stored as a JSON object, each line is stored in the "message" field as a string, like this:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1272,
    "max_score": 1,
    "hits": [
      {
        "_index": "dev-2016.01.05",
        "_type": "log",
        "_id": "AVISEHLA9KG7wxjkeWpi",
        "_score": 1,
        "_source": {
          "@timestamp": "2016-01-05T13:53:29.648Z",
          "beat": {
            "hostname": "DEV01",
            "name": "DEV01"
          },
          "count": 1,
          "fields": null,
          "input_type": "log",
          "message": " {\"LogLevel\":\"INFO\",\"StartTime\":\"15:53:29.1970996\",\"ExecutionTime\":0,\"CallingMethod\":\"InvokeMethod\"}",
          "offset": 50444441,
          "source": "D:\\Logs\\Pl_20160105-15.log",
          "type": "log"
        }
      },....

Any idea how to make it JSON?

Thank you!
Ovidiu


(Magnus Bäck) #2

Set codec => "json" for your beats input.


(Dj Vidov) #3

Thanks for the answer.
Where is the codec setting?


(Magnus Bäck) #4
input {
  beats {
    ...
    codec => "json"
  }
}

(Steffen Siering) #5

Filebeat only forwards lines; it does not parse JSON into any kind of document. That functionality is provided by Logstash codecs or filters.
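
As an illustration of the filter route mentioned above, here is a minimal sketch using the Logstash json filter. It assumes the beats input is left without a json codec, so each event still carries the raw line in its "message" field.

```
filter {
  # Parse the JSON text that Filebeat shipped in the "message" field
  # and merge the resulting keys into the top level of the event.
  json {
    source => "message"
  }
}
```

Lines that are not valid JSON are not dropped by this filter; they are tagged with _jsonparsefailure, which can make a filter easier to debug than a codec on the input.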


(Steffen Siering) #6

There's some discussion on GitHub about supporting direct indexing of JSON logs.


(Dj Vidov) #7

Thank you for the answer. I agree with you that Filebeat is a forwarder, and I have also read the discussion on GitHub. My first expectation was that JSON in the file would become JSON in Elasticsearch, but in the end a line from the file remains a string.


(Dj Vidov) #8

Hello,

I started writing a JSON log parser config for Logstash, which looks like this:

input {
	stdin {
		codec => "json"
	}
	beats {
		codec => "json"
		port => 9202
	}
}
filter { }

output {
	stdout { codec => rubydebug }
	elasticsearch { 
		hosts => ["localhost:9200"] 
		user => "admin1"
		password => "admin1"
	}
}

and it really works for a tiny JSON document.

But for a real one, like this:

{
	"LogLevel": "INFO",
	"StartTime": "2016-01-05 14:39:40.2012272",
	"ExecutionTime": 0,
	"CallingMethod": "SetPars",
	"Correlation": {
		"CorrelationId": "60e9d53e-e936-4e87-a47f-e009d4e3f3fb",
		"SequenceNo": 1
	},
	"Inputs": ["39b80f66-8547-4a3e-b9e1-e69a2d56329b",
	[{
		"Value": "IBr",
		"Name": "BrName",
		"Status": 0
	},
	{
		"Value": 101,
		"Name": "BrCode",
		"Status": 0
	}]]
}

even though this is valid JSON (I tested it on http://jsonlint.com/), it returns this huge error:

{
         "LogLevel" => "INFO",
        "StartTime" => "2016-01-05 14:39:40.2012272",
    "ExecutionTime" => 0,
    "CallingMethod" => "SetPars",
      "Correlation" => {
        "CorrelationId" => "60e9d53e-e936-4e87-a47f-e009d4e3f3fb",
           "SequenceNo" => 1
    },
           "Inputs" => [
        [0] "39b80f66-8547-4a3e-b9e1-e69a2d56329b",
        [1] [
            [0] {
                 "Value" => "IBr",
                  "Name" => "BrName",
                "Status" => 0
            },
            [1] {
                 "Value" => 101,
                  "Name" => "BrCode",
                "Status" => 0
            }
        ]
    ],
         "@version" => "1",
       "@timestamp" => "2016-01-06T09:35:15.001Z",
             "host" => "DV01LPT09"
}

Failed action.  {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2016.01.06", :_type=>"logs", :_routing=>nil}, 
#<LogStash::Event:0x1a5364ea @metadata_accessors=#<LogStash::Util::Accessors:0x3fff4430 @store={}, @lut={}>,
 @cancelled=false, @data={"LogLevel"=>"INFO", "StartTime"=>"2016-01-05 14:39:40.2012272", "ExecutionTime"=>0, "CallingMethod"=>"SetPars", "Correlation"=>{"CorrelationId"=>"60e9d53e-e936-4e87-a47f-e009d4e3f3fb", "SequenceNo"=>1}, "Inputs"=>["39b80f66-8547-4a3e-b9e1-e69a2d56329b", [{"Value"=>"IBr", "Name"=>"BrName", "Status"=>0}, {"Value"=>101, "Name"=>"BrCode", "Status"=>0}]], "@version"=>"1", "@timestamp"=>"2016-01-06T09:35:15.001Z", "host"=>"DV01LPT09"}, 
@metadata={}, @accessors=#<LogStash::Util::Accessors:0x6f4b1f0e @store={"LogLevel"=>"INFO", "StartTime"=>"2016-01-05 14:39:40.2012272", "ExecutionTime"=>0, "CallingMethod"=>"SetPars", "Correlation"=>{"CorrelationId"=>"60e9d53e-e936-4e87-a47f-e009d4e3f3fb", "SequenceNo"=>1}, "Inputs"=>["39b80f66-8547-4a3e-b9e1-e69a2d56329b", [{"Value"=>"IBr", "Name"=>"BrName", "Status"=>0}, {"Value"=>101, "Name"=>"BrCode", "Status"=>0}]], "@version"=>"1", "@timestamp"=>"2016-01-06T09:35:15.001Z", "host"=>"DV01LPT09"},
@lut={"host"=>[{"LogLevel"=>"INFO", "StartTime"=>"2016-01-05 14:39:40.2012272","ExecutionTime"=>0, "CallingMethod"=>"SetPars", "Correlation"=>{"CorrelationId"=>"60e9d53e-e936-4e87-a47f-e009d4e3f3fb", "SequenceNo"=>1}, "Inputs"=>["39b80f66-8547-4a3e-b9e1-e69a2d56329b", [{"Value"=>"IBr", "Name"=>"BrName", "Status"=>0}, {"Value"=>101, "Name"=>"BrCode", "Status"=>0}]], "@version"=>"1", "@timestamp"=>"2016-01-06T09:35:15.001Z", "host"=>"DV01LPT09"}, "host"],
 "type"=>[{"LogLevel"=>"INFO", "StartTime"=>"2016-01-05 14:39:40.2012272", "ExecutionTime"=>0, "CallingMethod"=>"SetPars", "Correlation"=>{"CorrelationId"=>"60e9d53e-e936-4e87-a47f-e009d4e3f3fb", "SequenceNo"=>1}, "Inputs"=>["39b80f66-8547-4a3e-b9e1-e69a2d56329b", [{"Value"=>"IBr", "Name"=>"BrName", "Status"=>0}, {"Value"=>101, "Name"=>"BrCode", "Status"=>0}]], "@version"=>"1", "@timestamp"=>"2016-01-06T09:35:15.001Z", "host"=>"DV01LPT09"}, "type"]}>>],
 :response=>{"create"=>{"_index"=>"logstash-2016.01.06", "_type"=>"logs", "_id"=>"AVIWSlGY42d1jACX8Te2", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Merging dynamic updates triggered a conflict: mapper [Inputs] of different type, current_type [string], merged_type [ObjectMapper]"}}}, :level=>:warn}

From what I understand, the issue comes from the Inputs object, which is a generic array:

"Inputs": ["39b80f66-8547-4a3e-b9e1-e69a2d56329b",
	[{
		"Value": "IBr",
		"Name": "BrName",
		"Status": 0
	},
	{
		"Value": 101,
		"Name": "BrCode",
		"Status": 0
	}]]

But I have no idea how to fix it.

Do you have any ideas?

Thanks,
Ovidiu
P.S. I really like this comments framework. Is it open source? :)


(Magnus Bäck) #9

Here's the error:

"error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Merging dynamic updates triggered a conflict: mapper [Inputs] of different type, current_type [string], merged_type [ObjectMapper]"}

This means that the Inputs field has been mapped as a string in the destination index but you're suddenly trying to turn the field into an object. Once set, mappings are fixed for an index. If you're just playing around, perhaps you can just drop the index and start over?


(Dj Vidov) #10

Thank you for the answer, Magnus!

I saw the error, but from my point of view Inputs is an object.
I deleted the index, and when I tried again with the same message I received the same error:

"reason"=>"Merging dynamic updates triggered a conflict: mapper [Inputs] of different type, current_type [string], merged_type [ObjectMapper]"}}}, :level=>:warn}

Is there a way to mark Inputs as an object?

Thanks,
Ovidiu


(Magnus Bäck) #11

You can explicitly set the mappings for the index. But since ES by default sets the mapping of a field based on the first document containing the field, it seems that you're first creating a document where the field is a string (creating a string mapping for the field) and afterwards trying to create a document where the field is an object.


(Dj Vidov) #12

I'm confused.

At this moment the only thing I am sure of is that I didn't create the index with the Inputs field as a string, because the Inputs field in my code is always an array of params.

  1. Is there documentation on how to set the Inputs field as an object instead of a string?
  2. Why is it necessary to map a message if Elasticsearch is a document database? Can't I save any type of data into an index?

thank you!
Ovidiu


(Steffen Siering) #13

When indexing into Elasticsearch you have to normalize your data to have a common format. It would be easiest if the application producing the log had a 'common' scheme for its output. Alternatively, you can try to create a workaround by renaming the 'Inputs' field depending on its type, e.g. if 'Inputs' is a string, rename 'Inputs' to 'message'. You can also try to JSON-encode the 'Inputs' field, so it's always a string.
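
The JSON-encoding workaround might look like the sketch below. It assumes the separately installed logstash-filter-json_encode plugin is available and that the field is named Inputs as in post #8. Elasticsearch expects all values in an array to share one type, and the Inputs array in post #8 mixes a plain string with a nested array of objects, which may be why the conflict reappears even on a fresh index.

```
filter {
  # Assumption: the logstash-filter-json_encode plugin is installed.
  # Serialize whatever "Inputs" holds (string, array, or object)
  # into a single JSON string and write it back to the same field.
  json_encode {
    source => "Inputs"
    target => "Inputs"
  }
}
```

The trade-off is that the nested Value, Name, and Status entries are then indexed as one string rather than as separately searchable fields.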


(Dj Vidov) #14

Hi Steffen,

I agree with you about normalization, and at this moment all my messages are normalized, because right now they are all saved in an RDBMS. Now I am trying to put them into Elasticsearch in order to improve search. But I don't understand why Inputs is mapped as a string even though it is an array object.
I found the mapping for my index, which looks like this:

{
  "logstash-2016.01.06": {
    "aliases": {},
    "mappings": {
      "logs": {
        "_all": {
          "enabled": true,
          "omit_norms": true
        },
        "dynamic_templates": [
          {
            "message_field": {
              "mapping": {
                "fielddata": {
                  "format": "disabled"
                },
                "index": "analyzed",
                "omit_norms": true,
                "type": "string"
              },
              "match": "message",
              "match_mapping_type": "string"
            }
          },
          {
            "string_fields": {
              "mapping": {
                "fielddata": {
                  "format": "disabled"
                },
                "index": "analyzed",
                "omit_norms": true,
                "type": "string",
                "fields": {
                  "raw": {
                    "ignore_above": 256,
                    "index": "not_analyzed",
                    "type": "string",
                    "doc_values": true
                  }
                }
              },
              "match": "*",
              "match_mapping_type": "string"
            }
          },

I also found this article, which I hope will help me change Inputs from string to object: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html

Regards,
Ovidiu


(Steffen Siering) #15

How is the mapping generated?

You might also want to check out index templates. There are some examples using index templates with mappings in Packetbeat and Topbeat.
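
For reference, the Logstash elasticsearch output can upload an index template itself. The sketch below is only an outline: the template path and name are hypothetical, and the template file (not shown) would have to declare the desired mappings and an index pattern matching the target indices. The dynamic templates visible in post #14 look like the ones from the default template this output installs, though that is an inference from the mapping rather than something confirmed in the thread.

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Hypothetical path to a JSON index template that declares the mappings.
    template => "/etc/logstash/templates/app-logs.json"
    # Hypothetical name, chosen so it does not replace the default template.
    template_name => "app-logs"
    # Re-upload the template even if one with this name already exists.
    template_overwrite => true
  }
}
```

A template only applies to indices created after it is installed, so an existing index with a conflicting mapping still has to be dropped or reindexed.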


(Dj Vidov) #16

I don't generate any mapping myself; today I discovered that I can set the mapping before I insert any data into the index.
The mapping is generated at the first insert. :)

I'll check your links.

Thank you!
Ovidiu


(Dj Vidov) #17

Hello,

I have finished the task. Instead of changing the Logstash conf, I changed the JSON generated in the log file.

Thanks for your support.
Ovidiu

