Parsing logs with Logstash

How would you parse this key-value data? (Note: this log data is on a single line.)

myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 # myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276

I want to parse the fields myapp.myproject.notice.student.request-time and myapp.myproject.notice.student.response-time.

I tried this for one of the fields.

logstash.conf

filter {
  kv {
    source => "message"
    include_keys => ["myapp.myproject.notice.student.request-time"]
    target => "kv"
  }

  date {
    match => [ "myapp.myproject.notice.student.request-time",
               "yyyy-MM-dd HH:mm:ss.SSS",
               "yyyy-MM-dd HH:mm:ss.SSS Z",
               "MMM dd, yyyy HH:mm:ss" ]
    timezone => "UTC"
  }
}

The issue is that I don't get the time component (I get zero for the time) in the date field in the Kibana output. In Kibana I get myapp.myproject.notice.student.request-time = Dec 13, 2019 @ 00:00:00.000.

How do I fix the time component?

Hi

You should probably add a field_split parameter to your kv filter (https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html#plugins-filters-kv-field_split).
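For illustration, a minimal sketch of where that parameter would go in the kv filter from the original post (untested; it assumes "#" is the separator between the two key-value pairs):

```
kv {
  source => "message"
  # split the line into pairs on "#" instead of the default whitespace
  field_split => "#"
  include_keys => ["myapp.myproject.notice.student.request-time"]
  target => "kv"
}
```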

If you use a stdout{} output, what do you get?

Hope this helps

If I add the field_split => "#" parameter to the kv filter,

I get this error at logstash startup:

[main] Exception while parsing KV {:exception=>"Invalid FieldReference: 2019-12-18 11:48:58.700 INFO 27794 --- [http-nio-9003-exec-1] c.s.s.i.c.Controller : "myapp.myproject.notice.student.request-time "}
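A likely cause, judging from the error text: with field_split => "#" the keys keep their surrounding spaces, and a key containing spaces is not a valid field reference. The kv filter's trim_key and trim_value options should strip them; an untested sketch:

```
kv {
  source => "message"
  field_split => "#"
  # remove the spaces around "=" so keys become valid field references
  trim_key => " "
  trim_value => " "
  include_keys => ["myapp.myproject.notice.student.request-time"]
  target => "kv"
}
```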

Do you have to use the kv filter?

mutate {
  split => ["message", "="]
  add_field => { "field1" => "%{[message][0]}" }
  add_field => { "field2" => "%{[message][1]}" }
}

and now you have two new fields.

Okay... I tried the mutate filter, but the fields and values are not printed to stdout when I start Logstash. It does not work.

See this:

[2019-12-18T15:14:28,130][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.4.0"}
[2019-12-18T15:14:30,422][INFO ][org.reflections.Reflections] Reflections took 46 ms to scan 1 urls, producing 20 keys and 40 values
[2019-12-18T15:14:31,961][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2019-12-18T15:14:31,967][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, :thread=>"#<Thread:0x194809b3 run>"}
[2019-12-18T15:14:32,277][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2019-12-18T15:14:32,387][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-12-18T15:14:32,406][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2019-12-18T15:14:32,796][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9601}

Show us your config file, especially the filter section.

Here it is:

input {
  file {
    path => "/opt/codedeploy/myapps.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {

       mutate {
       split => ["message","="]
       add_field => {"myapp.myproject.notice.student.request-time" =>"%{[message][0]}"}
       add_field => {"myapp.myproject.notice.student.response-time" =>"%{[message][1]}"}

       }
 }

output {
  stdout { codec => rubydebug }
}

No, this is wrong. I just read your initial message; the whole thing is one line,
so you should be splitting your message in two by "#" and then by "=".

Do this:

filter {

  mutate {
    split => ["message", "#"]
    # this will divide your line into two parts, before and after "#",
    # and they will be in [message][0] and [message][1]
    add_field => { "part1" => "%{[message][0]}" }
    add_field => { "part2" => "%{[message][1]}" }
  }

  # now split again
  mutate {
    split => ["part1", "="]
  }
}

Check the output and you should have everything separated out.

OK... I tried that.

This is the new config file now.

It did not work either.

input {
  file {
    path => "/opt/codedeploy/myapps.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {

        mutate {
       split => ["message","#"]
       add_field => {"part1" =>"%{[message][0]}"}
       add_field => {"part2" =>"%{[message][1]}"}

       }

       mutate {
       split => ["part1","="]
       add_field => {"partA" =>"%{[message][0]}"}
       add_field => {"partB" =>"%{[message][1]}"}
       }

       mutate {
       split => ["part2","="]
       add_field => {"partC" =>"%{[message][0]}"}
       add_field => {"partD" =>"%{[message][1]}"}
       }


 }

output {
  stdout { codec => rubydebug }
}

I restarted Logstash, but it did not print the keys and values.

See this:
[2019-12-18T15:47:57,523][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.4.0"}
[2019-12-18T15:47:59,944][INFO ][org.reflections.Reflections] Reflections took 45 ms to scan 1 urls, producing 20 keys and 40 values
[2019-12-18T15:48:01,590][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2019-12-18T15:48:01,597][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, :thread=>"#<Thread:0x555ef8da run>"}
[2019-12-18T15:48:01,959][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2019-12-18T15:48:02,058][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-12-18T15:48:02,061][INFO ][filewatch.observingtail ][main] START, creating Discoverer, Watch with file and sincedb collections
[2019-12-18T15:48:02,443][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9601}

filter {

    mutate {
   split => ["message","#"]
   add_field => {"part1" =>"%{[message][0]}"}
   add_field => {"part2" =>"%{[message][1]}"}

   }

   mutate {
   split => ["part1","="]
   add_field => {"partA" =>"%{[part1][0]}"}
   add_field => {"partB" =>"%{[part1][1]}"}
   }

   mutate {
   split => ["part2","="]
   add_field => {"partC" =>"%{[part2][0]}"}
   add_field => {"partD" =>"%{[part2][1]}"}
   }

}

Still did not work.

Updated logstash.conf:

input {
  file {
    path => "/opt/codedeploy/myapps.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {

       mutate {
       split => ["message","#"]
       add_field => {"part1" =>"%{[message][0]}"}
       add_field => {"part2" =>"%{[message][1]}"}

       }

       mutate {
       split => ["part1","="]
       add_field => {"partA" =>"%{[part1][0]}"}
       add_field => {"partB" =>"%{[part1][1]}"}
       }

       mutate {
       split => ["part2","="]
       add_field => {"partC" =>"%{[part2][0]}"}
       add_field => {"partD" =>"%{[part2][1]}"}
       }



 }

output {
  stdout { codec => rubydebug }
}

Log:

[2019-12-18T15:55:11,449][INFO ][logstash.runner ] Logstash shut down.
[2019-12-18T15:55:37,533][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.4.0"}
[2019-12-18T15:55:39,958][INFO ][org.reflections.Reflections] Reflections took 44 ms to scan 1 urls, producing 20 keys and 40 values
[2019-12-18T15:55:41,574][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2019-12-18T15:55:41,580][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, :thread=>"#<Thread:0x19a19969 run>"}
[2019-12-18T15:55:41,862][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2019-12-18T15:55:41,980][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-12-18T15:55:42,009][INFO ][filewatch.observingtail ][main] START, creating Discoverer, Watch with file and sincedb collections
[2019-12-18T15:55:42,356][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9601}

I ran it on my test system using standard input, and it works:


echo "myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 # myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276" | /usr/share/logstash/bin/logstash -f stdin-input-test.conf

{
         "part2" => [
        [0] " myapp.myproject.notice.student.response-time ",
        [1] " 2019-12-13 12:37:19.276"
    ],
         "partD" => " 2019-12-13 12:37:19.276",
         "partA" => "myapp.myproject.notice.student.request-time ",
       "message" => [
        [0] "myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 ",
        [1] " myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276"
    ],
    "@timestamp" => 2019-12-18T15:58:04.660Z,
         "partB" => " 2019-12-13 12:37:01.4 ",
          "host" => "houelktst01",
      "@version" => "1",
         "part1" => [
        [0] "myapp.myproject.notice.student.request-time ",
        [1] " 2019-12-13 12:37:01.4 "
    ],
         "partC" => " myapp.myproject.notice.student.response-time "
}

This is the config file:

input { stdin { } }

filter {

       mutate {
       split => ["message","#"]
       add_field => {"part1" =>"%{[message][0]}"}
       add_field => {"part2" =>"%{[message][1]}"}

       }

       mutate {
       split => ["part1","="]
       add_field => {"partA" =>"%{[part1][0]}"}
       add_field => {"partB" =>"%{[part1][1]}"}
       }

       mutate {
       split => ["part2","="]
       add_field => {"partC" =>"%{[part2][0]}"}
       add_field => {"partD" =>"%{[part2][1]}"}
       }
}
output { stdout { codec => rubydebug } }

Test it. If that works, then something is wrong with your input; maybe it is not reading the file because it has already read it once.

I was trying your logstash.conf file with a test from stdin.

But it is not running either.

Here it is:

[root@ip-xx-x-x-xx logstash]# echo "myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 # myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276" | /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
Thread.exclusive is deprecated, use Thread::Mutex
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2019-12-18 16:43:38.388 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[FATAL] 2019-12-18 16:43:38.408 [LogStash::Runner] runner - Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.
[ERROR] 2019-12-18 16:43:38.421 [LogStash::Runner] Logstash - java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit
[root@ip-xx-x-x-xx ]#

Is there any way I can test from stdin?

Oh well, it says right there: you can't have two Logstash processes running on a system.

You can either run this on another node, or stop the Logstash daemon and test this.

Logstash is a daemon. Like other services in Linux, only one instance runs at a time: sendmail/init/network/elasticsearch/kibana, etc.
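Alternatively, the FATAL message itself points at a workaround: give the test instance its own data directory with --path.data so it does not clash with the running daemon. An untested sketch (the /tmp directory name is arbitrary):

```
echo "myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 # myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276" | /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --path.data /tmp/logstash-test
```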

How do I stop it?

I have already executed initctl stop logstash.

Do I need a different command to stop the Logstash daemon?

Show me the output of the command below; many flavors of Linux have different approaches to stopping it.
Is this a test machine?

ps -ef | grep logstash

[root@ip-xx-x-x-xx logstash]# ps -ef |grep logstash.
root 4501 6526 0 17:20 pts/3 00:00:00 grep --color=auto logstash.
root 19164 7475 0 Nov25 ? 04:55:26 /usr/lib/jvm/jre/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djruby.compile.invokedynamic=true -Djruby.jit.threshold=0 -Djruby.regexp.interruptible=true -XX:+HeapDumpOnOutOfMemoryError -Djava.security.egd=file:/dev/urandom -Dlog4j2.isThreadContextMapInheritable=true -cp /usr/share/logstash/logstash-core/lib/jars/animal-sniffer-annotations-1.14.jar:/usr/share/logstash/logstash-core/lib/jars/commons-codec-1.11.jar:/usr/share/logstash/logstash-core/lib/jars/commons-compiler-3.0.11.jar:/usr/share/logstash/logstash-core/lib/jars/error_prone_annotations-2.0.18.jar:/usr/share/logstash/logstash-core/lib/jars/google-java-format-1.1.jar:/usr/share/logstash/logstash-core/lib/jars/gradle-license-report-0.7.1.jar:/usr/share/logstash/logstash-core/lib/jars/guava-22.0.jar:/usr/share/logstash/logstash-core/lib/jars/j2objc-annotations-1.1.jar:/usr/share/logstash/logstash-core/lib/jars/jackson-annotations-2.9.9.jar:/usr/share/logstash/logstash-core/lib/jars/jackson-core-2.9.9.jar:/usr/share/logstash/logstash-core/lib/jars/jackson-databind-2.9.9.3.jar:/usr/share/logstash/logstash-core/lib/jars/jackson-dataformat-cbor-2.9.9.jar:/usr/share/logstash/logstash-core/lib/jars/janino-3.0.11.jar:/usr/share/logstash/logstash-core/lib/jars/javassist-3.24.0-GA.jar:/usr/share/logstash/logstash-core/lib/jars/jruby-complete-9.2.8.0.jar:/usr/share/logstash/logstash-core/lib/jars/jsr305-1.3.9.jar:/usr/share/logstash/logstash-core/lib/jars/log4j-api-2.11.1.jar:/usr/share/logstash/logstash-core/lib/jars/log4j-core-2.11.1.jar:/usr/share/logstash/logstash-core/lib/jars/log4j-slf4j-impl-2.11.1.jar:/usr/share/logstash/logstash-core/lib/jars/logstash-core.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.commands-3.6.0.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.contenttype-3.4.100.jar:/usr/share/logstash/logstash-core
/lib/jars/org.eclipse.core.expressions-3.4.300.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.filesystem-1.3.100.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.jobs-3.5.100.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.resources-3.7.100.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.core.runtime-3.7.0.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.equinox.app-1.3.100.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.equinox.common-3.6.0.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.equinox.preferences-3.4.1.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.equinox.registry-3.5.101.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.jdt.core-3.10.0.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.osgi-3.7.1.jar:/usr/share/logstash/logstash-core/lib/jars/org.eclipse.text-3.5.101.jar:/usr/share/logstash/logstash-core/lib/jars/reflections-0.9.11.jar:/usr/share/logstash/logstash core/lib/jars/slf4j-api-1.7.25.jar org.logstash.Logstash -f /etc/logstash/conf.d/logstash.conf --debug

I start and stop it with these commands. This works for me; I am using Amazon Linux 2:
# initctl start logstash
# initctl stop logstash

You can find out whether it has stopped by issuing the command:
ps -ef | grep logstash

And it says it is running. If you really want to kill it,

you can issue:
kill -9 19164

But it will stop all ingestion. Make sure you don't have anything else running through it; use your judgment.

Okay... I killed the process and started Logstash again with stdin now.

I get this:

{
         "part1" => [
        [0] "myapp.myproject.notice.student.request-time ",
        [1] " 2019-12-13 12:37:01.4 "
    ],
         "partA" => "myapp.myproject.notice.student.request-time ",
         "part2" => [
        [0] " myapp.myproject.notice.student.response-time ",
        [1] " 2019-12-13 12:37:19.276"
    ],
       "message" => [
        [0] "myapp.myproject.notice.student.request-time = 2019-12-13 12:37:01.4 ",
        [1] " myapp.myproject.notice.student.response-time = 2019-12-13 12:37:19.276"
    ],
          "host" => "ip-xx-0-0-xx",
      "@version" => "1",
    "@timestamp" => 2019-12-19T04:17:30.153Z,
         "partB" => " 2019-12-13 12:37:01.4 ",
         "partC" => " myapp.myproject.notice.student.response-time ",
         "partD" => " 2019-12-13 12:37:19.276"
}

There is an issue here.

I see the fields and values have blank spaces at the front and back. Can we trim this whitespace? How?
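One way, sketched from the mutate filter's strip option (untested against this exact pipeline): strip removes leading and trailing whitespace from each listed field, so a final mutate block could clean up the four part fields:

```
mutate {
  # remove leading/trailing whitespace from the split-out fields
  strip => ["partA", "partB", "partC", "partD"]
}
```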