Parsing a complex log file

Hello everyone :slight_smile:, I have the following log file:

2017-06-13 13:00:01,494 - INFO [Line: 48]: Begin logging
2017-06-13 13:00:01,494 - DEBUG [Line: 89]: Writing to lockfile. Lockfile location: /etc/conf/the/back_up.txt
2017-06-13 13:00:03,521 - WARNING [Line: 449]: Snapshotting is not enabled
2017-06-13 13:06:15,663 - INFO [Line: 898]: stderr: 17/06/13 13:00:13 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=true, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/DAT/ABC], targetPath=/etc/conf/the/back_up.txt, targetPathExists=true, preserveRawXattrs=false}
17/06/13 13:00:13 INFO impl.TimelineClientImpl: Timeline service address: http://ip-192-168-X-XX.xyz:9000/v1/example/
17/06/13 13:00:14 INFO tools.DistCp: DistCp job log path: /var/tar/xar
17/06/13 13:00:20 INFO mapreduce.JobSubmitter: number of splits:22
17/06/13 13:00:21 INFO impl.YarnClientImpl: Submitted application application_1495940390018_0989
17/06/13 13:00:21 INFO mapreduce.Job: Running job: job_1495940390018_0989
17/06/13 13:00:29 INFO mapreduce.Job: Job job_1495940390018_0989 running in uber mode : false
17/06/13 13:00:29 INFO mapreduce.Job: map 0% reduce 0%
17/06/13 13:00:46 INFO mapreduce.Job: map 11% reduce 0%
17/06/13 13:00:47 INFO mapreduce.Job: map 17% reduce 0%
17/06/13 13:00:48 INFO mapreduce.Job: map 18% reduce 0%
17/06/13 13:00:49 INFO mapreduce.Job: map 23% reduce 0%
17/06/13 13:00:50 INFO mapreduce.Job: map 28% reduce 0%
17/06/13 13:00:51 INFO mapreduce.Job: map 29% reduce 0%
17/06/13 13:00:52 INFO mapreduce.Job: map 32% reduce 0%
17/06/13 13:00:53 INFO mapreduce.Job: map 37% reduce 0%
17/06/13 13:00:54 INFO mapreduce.Job: map 38% reduce 0%
17/06/13 13:00:55 INFO mapreduce.Job: map 41% reduce 0%
17/06/13 13:00:56 INFO mapreduce.Job: map 44% reduce 0%
17/06/13 13:00:57 INFO mapreduce.Job: map 45% reduce 0%
17/06/13 13:00:58 INFO mapreduce.Job: map 47% reduce 0%
17/06/13 13:00:59 INFO mapreduce.Job: map 48% reduce 0%
17/06/13 13:01:00 INFO mapreduce.Job: map 49% reduce 0%
17/06/13 13:01:07 INFO mapreduce.Job: map 54% reduce 0%
17/06/13 13:01:08 INFO mapreduce.Job: map 57% reduce 0%
17/06/13 13:01:10 INFO mapreduce.Job: map 59% reduce 0%
17/06/13 13:01:11 INFO mapreduce.Job: map 60% reduce 0%
17/06/13 13:01:13 INFO mapreduce.Job: map 62% reduce 0%
17/06/13 13:01:14 INFO mapreduce.Job: map 63% reduce 0%
17/06/13 13:01:15 INFO mapreduce.Job: map 64% reduce 0%
17/06/13 13:01:16 INFO mapreduce.Job: map 65% reduce 0%
17/06/13 13:01:31 INFO mapreduce.Job: map 76% reduce 0%
17/06/13 13:01:35 INFO mapreduce.Job: map 77% reduce 0%
17/06/13 13:01:39 INFO mapreduce.Job: map 78% reduce 0%
17/06/13 13:01:44 INFO mapreduce.Job: map 79% reduce 0%
17/06/13 13:01:48 INFO mapreduce.Job: map 80% reduce 0%
17/06/13 13:01:52 INFO mapreduce.Job: map 81% reduce 0%
17/06/13 13:01:55 INFO mapreduce.Job: map 82% reduce 0%
17/06/13 13:01:58 INFO mapreduce.Job: map 83% reduce 0%
17/06/13 13:02:01 INFO mapreduce.Job: map 84% reduce 0%
17/06/13 13:02:06 INFO mapreduce.Job: map 85% reduce 0%
17/06/13 13:02:09 INFO mapreduce.Job: map 86% reduce 0%
17/06/13 13:02:12 INFO mapreduce.Job: map 87% reduce 0%
17/06/13 13:02:16 INFO mapreduce.Job: map 88% reduce 0%
17/06/13 13:02:18 INFO mapreduce.Job: map 89% reduce 0%
17/06/13 13:02:23 INFO mapreduce.Job: map 90% reduce 0%
17/06/13 13:02:28 INFO mapreduce.Job: map 91% reduce 0%
17/06/13 13:02:36 INFO mapreduce.Job: map 92% reduce 0%
17/06/13 13:02:42 INFO mapreduce.Job: map 93% reduce 0%
17/06/13 13:02:47 INFO mapreduce.Job: map 94% reduce 0%
17/06/13 13:02:51 INFO mapreduce.Job: map 95% reduce 0%
17/06/13 13:02:57 INFO mapreduce.Job: map 96% reduce 0%
17/06/13 13:03:04 INFO mapreduce.Job: map 97% reduce 0%
17/06/13 13:03:10 INFO mapreduce.Job: map 98% reduce 0%
17/06/13 13:03:30 INFO mapreduce.Job: map 99% reduce 0%
17/06/13 13:03:58 INFO mapreduce.Job: map 100% reduce 0%
17/06/13 13:06:15 INFO mapreduce.Job: Job job_1495940390018_0989 completed successfully
17/06/13 13:06:15 INFO mapreduce.Job: Counters: 33
File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=30634
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=1810172
    HDFS: Number of bytes written=6602
    HDFS: Number of read operations=21710
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=4461
Job Counters
    Launched map tasks=22
    Other local map tasks=22
    Total time spent by all maps in occupied slots (ms)=09878
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=170939
    Total vcore-milliseconds taken by all map tasks=17049
    Total megabyte-milliseconds taken by all map tasks=1747536
Map-Reduce Framework
    Map input records=417
    Map output records=175
    Input split bytes=262
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=3338
    CPU time spent (ms)=3180
    Physical memory (bytes) snapshot=480768
    Virtual memory (bytes) snapshot=61798624
    Total committed heap usage (bytes)=2965728
File Input Format Counters
    Bytes Read=17510
File Output Format Counters
    Bytes Written=6616
org.apache.hadoop.tools.mapred.CopyMapper$Counter
    BYTESSKIPPED=11361
    COPY=1242
    SKIP=3175
2017-06-13 13:06:15,668 - INFO [Line: 904]: Distcp -log output stored in /var/AB/CY/
2017-06-13 13:06:15,673 - INFO [Line: 132]: End logging
2017-06-13 13:07:01,494 - INFO [Line: 48]: Begin logging
. {similar to above logs}
. {similar to above logs}
. {similar to above logs}
2017-06-13 13:07:15,673 - INFO [Line: 132]: End logging
... etc.

The log then continues with similar blocks for different job IDs, each starting with "Begin logging" and ending with "End logging", as shown above.

So my question here is: how can I parse this log with a Logstash config so that each block ends up as a single Elasticsearch record? By "block" I mean everything from "Begin logging" through "End logging".

  • Note: I am still new to ES, so any advice is useful to me :slight_smile: Thanks in advance!
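To make it concrete, this is roughly the direction I was imagining for the input side. It is just a rough multiline sketch; the file path and the option values are placeholders, not something I have working:

    input {
      file {
        path => "/path/to/the/backup_script.log"   # placeholder; the real log path goes here
        start_position => "beginning"
        codec => multiline {
          # Any line that does NOT contain "Begin logging" is appended to the
          # previous event, so one event spans "Begin logging" .. "End logging".
          pattern => "Begin logging"
          negate => true
          what => "previous"
          max_lines => 10000            # the blocks are long; the default cap is 500 lines
          auto_flush_interval => 10     # flush an unfinished block after 10 seconds of silence
        }
      }
    }

(If the file is shipped by Filebeat, the same merging could instead be done with Filebeat's multiline settings.)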

how can I parse this log with a Logstash config so that each block ends up as a single Elasticsearch record? By "block" I mean everything from "Begin logging" through "End logging".

Why do you want that? How would you then filter on e.g. the loglevel, logger name, line number, or whatever?

Also, are you sure all lines that belong together are logged sequentially? In other words, is it guaranteed that no other log entries could sneak in between?
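For comparison, parsing each line into its own event would keep those things queryable; a minimal sketch, with field names that are just examples:

    filter {
      grok {
        # e.g. "2017-06-13 13:00:01,494 - INFO [Line: 48]: Begin logging"
        match => {
          "message" => "%{TIMESTAMP_ISO8601:log_time} - %{LOGLEVEL:loglevel} \[Line: %{NUMBER:line_number:int}\]: %{GREEDYDATA:log_message}"
        }
      }
      date {
        match => [ "log_time", "yyyy-MM-dd HH:mm:ss,SSS" ]
      }
    }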


Thanks Magnus for your reply,

  • Regarding the first part of your question, the case I have is as follows:
    I will have this as a parameter in my ES query (targetPath=/etc/conf/the/back_up.tx),
    and then I am only interested in finding the job ID for that target path and checking whether the job completed successfully or failed.
    You can see the parts I am interested in above: the job ID "job_1495940390018_0989", targetPath=/etc/conf/the/back_up.tx, and "job_1495940390018_0989 completed successfully".

In other words, the ES query will be something like: find the job that has targetPath=/etc/conf/the/back_up.tx and return whether that job/operation succeeded or failed, along with its timestamp.
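To illustrate the fields I would want on each block, here is only a sketch (the field names and patterns are my guesses, and it assumes the whole block has already been merged into the message field):

    filter {
      grok {
        # Pull the job ID and the DistCp target path out of the merged block.
        # The trailing comma after %{DATA:target_path} stops the capture at the
        # comma that follows targetPath=... in the DistCpOptions line.
        match => {
          "message" => [
            "Running job: %{NOTSPACE:job_id}",
            "targetPath=%{DATA:target_path},"
          ]
        }
        break_on_match => false
      }
      if "completed successfully" in [message] {
        mutate { add_field => { "job_status" => "success" } }
      } else {
        mutate { add_field => { "job_status" => "failure" } }
      }
    }

A query could then filter on target_path and job_status instead of digging through the raw text.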

  • Regarding your second question: no, I am not sure the sequence will always stay intact. It can happen, for example, that a failure causes me to lose some of the log lines, e.g. here:
    17/06/13 13:01:10 INFO mapreduce.Job: map 59% reduce 0%
    17/06/13 13:01:11 INFO mapreduce.Job: map 60% reduce 0%
    17/06/13 13:01:13 INFO mapreduce.Job: map 62% reduce 0%
    17/06/13 13:01:14 INFO mapreduce.Job: map 63% reduce 0%
    17/06/13 13:01:15 INFO mapreduce.Job: map 64% reduce 0%
    it may stop at 64% and fail (one way to flag such incomplete blocks is sketched after this list).

  • I understand it's not a well-structured log, but I can't change much of this structure since it's produced automatically by Hadoop.
    Any suggestions or workarounds you might be thinking of?
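One thing I was considering for the failure case mentioned above, again only a sketch (the tag name is mine): tag any merged block that never reaches "End logging", since that probably means the job died part-way through:

    filter {
      if "End logging" not in [message] {
        mutate { add_tag => [ "incomplete_block" ] }
      }
    }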

Much appreciated :slight_smile:
