Issue with Logstash

You can find it here.

Your problem is that your data contains two different date formats.
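
For what it's worth, the date filter accepts several patterns for the same field and tries them in order, so both formats could be handled in one filter. A minimal sketch, assuming the two formats are the ones shown later in this thread ("2023/01/04" and "2023-01-04T15:30:28"):

filter {
  date {
    # the first pattern that matches wins; the two patterns below are assumptions
    # based on the sample data posted in this thread
    match => ["creation_date", "yyyy/MM/dd", "yyyy-MM-dd'T'HH:mm:ss"]
  }
}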

I found that the problem is my date format. I created a new file which only contains the date in JSON format, and I still get a JSON parse failure.
Here is the content of the file: {"creation_date": "2023-01-04T15:30:28"}

And here is my config file:

input {
  file {
    codec => "json"
    path => "/testELK/exemple.json"
    start_position => "beginning"
    type => "json"
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
  date {
    match => ["creation_date", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
  }
}
output {
  elasticsearch {
    hosts => ["http://10.69.115.190:9200"]
    index => "data1"
  }
  stdout {
    codec => rubydebug
  }
}

The

remove_field => ["message"]

part must go into a mutate filter, not into the json filter plugin:

mutate {
  remove_field => [ "message" ]
}

The whole configuration may look like this:

input {
  file {
    codec => "json"
    path => "/testELK/exemple.json"
    start_position => "beginning"
    type => "json"
  }
}
filter {
  json {
    source => "message"
  }
  date {
    match => ["creation_date", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
  }
  mutate {
    remove_field => [ "message" ]
  }
}
output {
  elasticsearch {
   hosts => ["http://10.69.115.190:9200"]
   index => "data1"
  }
  stdout {
   codec => rubydebug
  }
}

Here, when you post, please use the formatting symbols at the top of the editor so the text is easier to read.

Hello, I am trying to parse this JSON with Logstash.

{"creation_date": "2023/01/04", "vulnerabilities": [{"count": 1, "score": null, "vuln_index": 414, "plugin_name": "WordPad History", "severity": 0, "vpr_score": null, "plugin_id": 92438, "severity_index": 0, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 396, "plugin_name": "Windows Defender Installed", "severity": 0, "vpr_score": null, "plugin_id": 131023, "severity_index": 1, "cpe": "cpe:/a:microsoft:windows_defender", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 333, "plugin_name": "RDP Screenshot", "severity": 0, "vpr_score": null, "plugin_id": 66173, "severity_index": 2, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "General", "snoozed": 0}]}

Everything works fine, but when I add another object to my JSON it fails, and I don't understand why. (I added another object to the vulnerabilities key.)

I get this error:

[ERROR] 2023-04-25 14:40:13.055 [[main]<file] json - JSON parse error, original data now in message field {:message=>"Unrecognized token 'count': was expecting ('true', 'false' or 'null')\n at [Source: (String)\"count\": 1, \"score\": null, \"vuln_index\": 319, \"plugin_name\": \"NetBIOS Multiple IP Address Enumeration\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 43815, \"severity_index\": 3, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}\"; line: 1, column: 6]", :exception=>LogStash::Json::ParserError, :data=>"count\": 1, \"score\": null, \"vuln_index\": 319, \"plugin_name\": \"NetBIOS Multiple IP Address Enumeration\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 43815, \"severity_index\": 3, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}"}
[WARN ] 2023-04-25 14:40:13.206 [[main]>worker0] json - Error parsing json {:source=>"message", :raw=>"count\": 1, \"score\": null, \"vuln_index\": 319, \"plugin_name\": \"NetBIOS Multiple IP Address Enumeration\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 43815, \"severity_index\": 3, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'count': was expecting ('true', 'false' or 'null')
 at [Source: (byte[])"count": 1, "score": null, "vuln_index": 319, "plugin_name": "NetBIOS Multiple IP Address Enumeration", "severity": 0, "vpr_score": null, "plugin_id": 43815, "severity_index": 3, "cpe": null, "offline": false, "plugin_family": "Windows", "snoozed": 0}]}"; line: 1, column: 7]>}
{
          "type" => "json",
          "path" => "/testELK/exemple.json",
      "@version" => "1",
          "host" => "lnessus02v",
    "@timestamp" => 2023-04-25T12:40:13.066Z,
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}

That is not correct. When the common remove_field option is used in the json filter, the removal of the [message] field only occurs if the filter successfully parses the JSON. If it fails, then the [message] field is retained and you can examine it to see why it failed to parse.
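
For illustration, this is what that means in practice with the common option left inside the json filter (a sketch of the same filter as above, just annotated):

filter {
  json {
    source => "message"
    # remove_field is a common option: it is only applied if this filter succeeds,
    # so on a _jsonparsefailure the message field is kept and can be inspected
    remove_field => ["message"]
  }
}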


Hello, can you help me with my issue?

The string that it is trying to parse starts with count", not {"creation_date":. That suggests that when you added another vulnerability you got the JSON structure wrong.

No, the structure is not wrong. When I add another vulnerability my JSON looks like this:

{"creation_date": "2023/01/04", "vulnerabilities": [{"count": 1, "score": null, "vuln_index": 414, "plugin_name": "WordPad History", "severity": 0, "vpr_score": null, "plugin_id": 92438, "severity_index": 0, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 396, "plugin_name": "Windows Defender Installed", "severity": 0, "vpr_score": null, "plugin_id": 131023, "severity_index": 1, "cpe": "cpe:/a:microsoft:windows_defender", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 333, "plugin_name": "RDP Screenshot", "severity": 0, "vpr_score": null, "plugin_id": 66173, "severity_index": 2, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "General", "snoozed": 0},{"count": 1, "score": null, "vuln_index": 319, "plugin_name": "NetBIOS Multiple IP Address Enumeration", "severity": 0, "vpr_score": null, "plugin_id": 43815, "severity_index": 3, "cpe": null, "offline": false, "plugin_family": "Windows", "snoozed": 0}]}

A json filter will parse that producing an array of four vulnerabilities ending in

    [3] {
            "vuln_index" => 319,
                 "count" => 1,
                 "score" => nil,
        "severity_index" => 3,
                   "cpe" => nil,
             "plugin_id" => 43815,
           "plugin_name" => "NetBIOS Multiple IP Address Enumeration",
         "plugin_family" => "Windows",
               "offline" => false,
             "vpr_score" => nil,
               "snoozed" => 0,
              "severity" => 0
    }

I do not think your data looks the way you think it looks.

Here is my data:

{"creation_date": "2023/01/04", "vulnerabilities": [{"count": 1, "score": null, "vuln_index": 414, "plugin_name": "WordPad History", "severity": 0, "vpr_score": null, "plugin_id": 92438, "severity_index": 0, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 396, "plugin_name": "Windows Defender Installed", "severity": 0, "vpr_score": null, "plugin_id": 131023, "severity_index": 1, "cpe": "cpe:/a:microsoft:windows_defender", "offline": false, "plugin_family": "Windows", "snoozed": 0}, {"count": 1, "score": null, "vuln_index": 333, "plugin_name": "RDP Screenshot", "severity": 0, "vpr_score": null, "plugin_id": 66173, "severity_index": 2, "cpe": "cpe:/o:microsoft:windows", "offline": false, "plugin_family": "General", "snoozed": 0},{"count": 1, "score": null, "vuln_index": 319, "plugin_name": "NetBIOS Multiple IP Address Enumeration", "severity": 0, "vpr_score": null, "plugin_id": 43815, "severity_index": 3, "cpe": null, "offline": false, "plugin_family": "Windows", "snoozed": 0}]}

Here is my config file:

input {
  file {
    path => "/testELK/exemple.json"
    codec => "json"
    start_position => "beginning"
    type => "json"
  }
}

filter {
  json {
    source => "message"
  }
  mutate {
    remove_field => [ "message" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "data7"
  }
  stdout {
    codec => rubydebug
  }
}

If you are using a json codec then there is no reason to use a json filter. When you use a json codec there will not be a field called [message] unless parsing the JSON fails. If it fails in the codec then [message] will get created and the json filter will get the same parse failure as the codec.

I suggest you remove the json and mutate filters and see what the [message] field looks like when it gets to elasticsearch. I am sure it will not look the way you expect.
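
A stripped-down pipeline for that test might look something like this (only the stdout output is kept here for readability; the elasticsearch output from your config can stay as well):

input {
  file {
    path => "/testELK/exemple.json"
    codec => "json"          # the codec does the decoding, so no json filter is needed
    start_position => "beginning"
    type => "json"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}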

When I remove the filters I get this:

[ERROR] 2023-04-25 17:05:59.794 [[main]<file] json - JSON parse error, original data now in message field {:message=>"Unrecognized token 'count': was expecting ('true', 'false' or 'null')\n at [Source: (String)\"count\": 1, \"score\": null, \"vuln_index\": 300, \"plugin_name\": \"Microsoft Windows SMB Registry : Winlogon Cached Password Weakness\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 11457, \"severity_index\": 4, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}\"; line: 1, column: 6]", :exception=>LogStash::Json::ParserError, :data=>"count\": 1, \"score\": null, \"vuln_index\": 300, \"plugin_name\": \"Microsoft Windows SMB Registry : Winlogon Cached Password Weakness\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 11457, \"severity_index\": 4, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}"}
{
          "type" => "json",
          "tags" => [
        [0] "_jsonparsefailure"
    ],
      "@version" => "1",
          "host" => "******",
       "message" => "count\": 1, \"score\": null, \"vuln_index\": 300, \"plugin_name\": \"Microsoft Windows SMB Registry : Winlogon Cached Password Weakness\", \"severity\": 0, \"vpr_score\": null, \"plugin_id\": 11457, \"severity_index\": 4, \"cpe\": null, \"offline\": false, \"plugin_family\": \"Windows\", \"snoozed\": 0}]}",
    "@timestamp" => 2023-04-25T15:05:59.802Z,
          "path" => "/testELK/exemple.json"
}

That [[main]<file] error you were getting before is coming from the codec in the file input. The [[main]>worker0] error that was coming from the json filter no longer occurs. That's all expected. What do you see in elasticsearch?

It just occurred to me what the problem might be. When logstash first sees the file it starts reading at the beginning. It remembers how far it has gotten in the sincedb. This is persisted across restarts. start_position => "beginning" only has any effect the first time the file input sees the file.

Suppose you have a file

{ "foo": "bar" }

and logstash reads all of it, recording its position at the end of the file. If you edit the file to contain

{ "foo": "bar", "hello": "world" }

then logstash will resume from the recorded position, read only the trailing "hello": "world" }, and the json codec will produce an error very similar to the one you are seeing.

Try adding sincedb_path => "/dev/null" to your file input and see if a pipeline restart correctly decodes the JSON.
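
For example, the file input from earlier with that option added might look like this:

input {
  file {
    path => "/testELK/exemple.json"
    codec => "json"
    start_position => "beginning"
    # forget the recorded read position on every restart so the whole file is re-read
    # (handy for testing, not something you would normally keep in production)
    sincedb_path => "/dev/null"
    type => "json"
  }
}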

In Elasticsearch I see this in the mapping:

{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "@version": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "host": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "message": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "path": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "tags": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "type": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

Hi, thank you for unblocking me. I have a question: when my index is created in Elasticsearch, all the data is put together. For example, in the index field, all the values are grouped together when they should be separate. We should have multiple lines of data in Elasticsearch, but instead we have only one, and each field contains all the data. Here's an example of the index field so you can see what I mean.
(screenshot of the index field)

I only have one line of data.

How can I fix this, please?

Do you mean that they are different sets of data, or is it all expressed in a single JSON object? Elasticsearch stores the data in JSON format, so for the field

vulnerabilities.vuln_[414,396,....]

what it does is store everything as an array of values for that field. If I'm not mistaken, it does this because it considers it all to be the same field, just containing N different values, which is why you see one field with N values in it.

Hello, I solved my problem, thank you for your help. But I have a question: how can I send many JSON files to Elasticsearch?

Hi, are all the JSON files located in the same folder?

If they are in the same folder, you may configure the input to read all .json files like this:

path =>  "/testELK/*.json"
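
So the file input might become something like this (the rest of the input left as in your existing config):

input {
  file {
    # the glob matches every .json file in /testELK; each file is read and decoded separately
    path => "/testELK/*.json"
    codec => "json"
    start_position => "beginning"
    type => "json"
  }
}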

Does that solve your problem, or are you still having it?
