Grok + Dissect - how to make them work together?

Hello, everyone.
I have a problem running the grok and dissect filters together. My config:

    grok {
      match => { 'message' => '\[%{TIMESTAMP_ISO8601:time_stamp}\] \[%{DATA:correlation_id}\] \[%{LOGLEVEL:log_level}\] \[%{DATA:source}\] \[%{DATA:uri_domain}\] \[%{DATA:someParameter1}\] \[%{DATA:someParameter2}\] %{GREEDYDATA:message}' }
    }
    dissect {
      mapping => {
        'message' => '[%{time_stamp}] [%{correlation_id}] [%{log_level}] [%{source}] [%{uri_domain}] [%{someParameter1}] [%{someParameter2}] %{message}'
      }
    }
    mutate { 
      add_field => { "[log_type]" => "services" }
    }   
    date {
      match => ["time_stamp", "ISO8601"]
      target => "@timestamp"
      remove_field => [ "time_stamp" ]
    }

And the message I am trying to filter/dissect is:

"message": "[2020-03-28T13:40:46.487] [0HLUIAK4FP7PP:00000001] [Info] [Microsoft.AspNetCore.Hosting.Diagnostics] [http://7c2d2ef3d838/health] [Consul Health Check] [] Request finished in 1.7305ms 200 application/json ",

And after dissect I get a strange result in Elasticsearch:

    {
      "_index": "someindex",
      "_type": "_doc",
      "_id": "Fh9DN3EB-H6bOdc5ghAi",
      "_version": 1,
      "_score": null,
      "_source": {
        "someParameter1": "Consul Health Check",
        "container": {
          "image": {
            "name": "nexus.sibintek.ru:80/mes/sibintekmesdatacollectorhost:latest"
          },
          "id": "7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d",
          "labels": {
            "com_docker_compose_service": "datacollector",
            "com_docker_compose_version": "1.25.4",
            "com_docker_compose_oneoff": "False",
            "com_docker_compose_config-hash": "edd50eece59244be62e51b03bcef017df9e00984f672f529a5c20be3e1883714",
            "com_docker_compose_project": "docker",
            "com_docker_compose_project_config_files": "Docker\\docker-compose.yml",
            "com_docker_compose_container-number": "1"
          },
          "name": "docker_datacollector_1"
        },
        "someParameter2": "",
        "ecs": {
          "version": "1.4.0"
        },
        "host": {
          "name": "bfdb0c038083"
        },
        "log_level": "Info",
        "log_type": "services",
        "tags": [
          "beats_input_codec_plain_applied",
          "_dateparsefailure"
        ],
        "uri_domain": "http://7a8252d247af/health",
        "input": {
          "type": "container"
        },
        "agent": {
          "ephemeral_id": "3f600c48-cfca-4aa5-a4f4-a3b2d9bd3576",
          "version": "7.6.1",
          "id": "ba03f4a2-9bf2-4f91-9a54-46cf4a9cf351",
          "hostname": "bfdb0c038083",
          "type": "filebeat"
        },
        "@timestamp": "2020-04-01T19:41:42.593Z",
        "@version": "1",
        "time_stamp": "\"[2020-04-01T22:41:42.593",
        "log": {
          "offset": 80400,
          "file": {
            "path": "/var/lib/docker/containers/7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d/7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d-json.log"
          }
        },
        "stream": "stdout",
        "correlation_id": "0HLUMEHK3AE8C:00000001",
        "source": "Microsoft.AspNetCore.Hosting.Diagnostics",
        "message": "Request finished in 1.7305ms 200 application/json \", \"Request finished in 1.7305ms 200 application/json \"]"
      },
      "fields": {
        "@timestamp": [
          "2020-04-01T19:41:42.593Z"
        ]
      },
      "sort": [
        1585770102593
      ]
    }

It seems like the message gets turned into an array containing the original message, and then dissect only processes the first element. I found this out when I tried a wrong mapping configuration for dissect:

[2020-04-01T19:47:24,447][WARN ][org.logstash.dissect.Dissector][main] Dissector mapping, pattern not found {"field"=>"message", "pattern"=>"\\[%{time_stamp}] [%{correlation_id}] [%{log_level}] [%{source}] [%{uri_domain}] [%{someParameter1}] [%{someParameter2}] %{message}", "event"=>{"message"=>["[2020-04-01T22:47:19.846] [de385773-167a-4fcf-970f-51719e48ece8] [Info] [Microsoft.AspNetCore.Hosting.Diagnostics] [http://datacollector/api/datacollector/Tag] [] [] Request finished in 11.4625ms 200 application/json; charset=utf-8 ", "Request finished in 11.4625ms 200 application/json; charset=utf-8 "], "stream"=>"stdout", "@timestamp"=>2020-04-01T19:47:19.847Z, "container"=>{"image"=>{"name"=>"somedomain:80/datacollectorhost:latest"}, "name"=>"docker_datacollector_1", "labels"=>{"com_docker_compose_project"=>"docker", "com_docker_compose_version"=>"1.25.4", "com_docker_compose_container-number"=>"1", "com_docker_compose_project_config_files"=>"Docker\\docker-compose.yml", "com_docker_compose_service"=>"datacollector", "com_docker_compose_oneoff"=>"False", "com_docker_compose_project_working_dir"=>"C:\\somedir\\Docker", "com_docker_compose_config-hash"=>"edd50eece59244be62e51b03bcef017df9e00984f672f529a5c20be3e1883714"}, "id"=>"7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d"}, "agent"=>{"ephemeral_id"=>"3f600c48-cfca-4aa5-a4f4-a3b2d9bd3576", "hostname"=>"bfdb0c038083", "version"=>"7.6.1", "type"=>"filebeat", "id"=>"ba03f4a2-9bf2-4f91-9a54-46cf4a9cf351"}, "time_stamp"=>"2020-04-01T22:47:19.846", "input"=>{"type"=>"container"}, "@version"=>"1", "log_level"=>"Info", "correlation_id"=>"de385773-167a-4fcf-970f-51719e48ece8", "tags"=>["beats_input_codec_plain_applied", "_dissectfailure"], "host"=>{"name"=>"bfdb0c038083"}, "log"=>{"file"=>{"path"=>"/var/lib/docker/containers/7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d/7a8252d247af509bd46e522b18c072145e433b5c5a06d53aa7a42b191349892d-json.log"}, "offset"=>285381}, 
"uri_domain"=>"http://datacollector/api/datacollector/Tag", "source"=>"Microsoft.AspNetCore.Hosting.Diagnostics", "ecs"=>{"version"=>"1.4.0"}}}

I also tried using only grok or only dissect, but both attempts failed for various reasons.

Can anyone help me find what is wrong with my configuration, please?

That is exactly what I would expect. You have a [message] field. You use grok to parse it with a pattern that ends in '%{GREEDYDATA:message}'. That will add a field called [message]. Since one already exists, the field is converted into an array. If you want to overwrite it then add

    overwrite => [ "message" ]

to your grok filter.

For the dissect, I am not surprised to hear that if passed an array it only dissects the first entry.
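
With that change, your grok block would look something like this (keeping your pattern as-is; sketch, not tested against your pipeline):

    grok {
      match => { 'message' => '\[%{TIMESTAMP_ISO8601:time_stamp}\] \[%{DATA:correlation_id}\] \[%{LOGLEVEL:log_level}\] \[%{DATA:source}\] \[%{DATA:uri_domain}\] \[%{DATA:someParameter1}\] \[%{DATA:someParameter2}\] %{GREEDYDATA:message}' }
      # replace the existing [message] instead of appending to it as an array
      overwrite => [ "message" ]
    }

With [message] overwritten rather than turned into an array, there is also no longer any need to run dissect over the same pattern.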

Hello, Badger.
Thank you for your answer.

I changed my config the way you pointed out, and now it looks like this:

    grok {
      match => { 'message' => '\[%{TIMESTAMP_ISO8601:time_stamp}\] \[%{DATA:correlation_id}\] \[%{LOGLEVEL:log_level}\] \[%{DATA:source}\] \[%{DATA:uri_domain}\] \[%{DATA:someParameter1}\] \[%{DATA:someParameter2}\] %{GREEDYDATA:message}' }
      overwrite => [ "message" ]
    }
    mutate {
      add_field => { "[log_type]" => "services" }
    }
    date {
      match => ["time_stamp", "ISO8601"]
      target => "@timestamp"
      remove_field => [ "time_stamp" ]
    }

And after that I see no data in Elasticsearch at all.

And if I pass a wrong configuration for dissect, like dissect { mapping => { 'message' => '\%{message}' } }, I get this error in the Logstash logs:

[2020-04-02T07:45:57,126][WARN ][org.logstash.dissect.Dissector][main] Dissector mapping, pattern not found {"field"=>"message", "pattern"=>"\\%{message}", "event"=>{"@timestamp"=>2020-04-02T07:45:52.646Z, "time_stamp"=>"2020-04-02T10:45:52.645", "@version"=>"1", "tags"=>["beats_input_codec_plain_applied", "_dissectfailure"], "input"=>{"type"=>"container"}, "source"=>"Microsoft.AspNetCore.Hosting.Diagnostics", "log_level"=>"Info", "stream"=>"stdout", "ecs"=>{"version"=>"1.4.0"}, "uri_domain"=>"http://datacollector/api/datacollector/Tag", "correlation_id"=>"84fda1c9-c4a8-41a2-b873-5c0381a9d8a0", "message"=>"Request finished in 7.3067ms 200 application/json; charset=utf-8 ", "log"=>{"offset"=>167426, "file"=>{"path"=>"/var/lib/docker/containers/a919070336088d59897bb06d6fc290e23b3f6fb087ef661ef692eac43df74f78/a919070336088d59897bb06d6fc290e23b3f6fb087ef661ef692eac43df74f78-json.log"}}, "agent"=>{"hostname"=>"dc04add040f4", "type"=>"filebeat", "version"=>"7.6.1", "ephemeral_id"=>"fc1f5df8-0000-4473-8188-df2f74b427e1", "id"=>"fa5185c7-7950-4879-b9c7-6ec916f51547"}, "host"=>{"name"=>"dc04add040f4"}, "container"=>{"labels"=>{"com_docker_compose_project_config_files"=>"Docker\\docker-compose.yml", "com_docker_compose_project"=>"docker", "com_docker_compose_oneoff"=>"False", "com_docker_compose_config-hash"=>"edd50eece59244be62e51b03bcef017df9e00984f672f529a5c20be3e1883714", "com_docker_compose_service"=>"datacollector", "com_docker_compose_version"=>"1.25.4", "com_docker_compose_project_working_dir"=>"C:\\somepath\\Docker", "com_docker_compose_container-number"=>"1"}, "name"=>"docker_datacollector_1", "id"=>"a919070336088d59897bb06d6fc290e23b3f6fb087ef661ef692eac43df74f78", "image"=>{"name"=>"somehost:80/datacollectorhost:latest"}}}}

That means grok is parsing the data correctly, but without dissect I can't see it in Elasticsearch for some reason.

And another point: it seems to me that remove_field in the date filter didn't remove time_stamp. You can see it in the error log above too.
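
Reading the date filter docs, I think this is expected: remove_field is one of the common options, and as far as I can tell those only run when the filter succeeds, so on a _dateparsefailure the field is kept:

    date {
      match => ["time_stamp", "ISO8601"]
      target => "@timestamp"
      # common options such as remove_field are only applied when the
      # date parse succeeds; on _dateparsefailure time_stamp stays in the event
      remove_field => [ "time_stamp" ]
    }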

OK, I now use only the dissect plugin to filter my log data, like this:

    dissect {
      mapping => {
        'message' => '[%{time_stamp}] [%{correlation_id}] [%{log_level}] [%{source}] [%{uri_domain}] [%{someParameter1}] [%{someParameter2}] %{message}'
      }
    }
    mutate {
      add_field => { "[log_type]" => "services" }
    }

And the core of the problem was time_stamp: when I don't use the date filter, I see all the data in Kibana.
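
If I need the date filter again later, I will probably guard it so it only runs after a successful dissect, something like this (untested sketch):

    dissect {
      mapping => {
        'message' => '[%{time_stamp}] [%{correlation_id}] [%{log_level}] [%{source}] [%{uri_domain}] [%{someParameter1}] [%{someParameter2}] %{message}'
      }
    }
    # only try to parse the date if dissect actually extracted time_stamp
    if "_dissectfailure" not in [tags] {
      date {
        match => ["time_stamp", "ISO8601"]
        target => "@timestamp"
        remove_field => [ "time_stamp" ]
      }
    }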

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.