How to get rid of a nested field, and other issues

Hello, I'm using Logstash 7.11 and my data has the following format:

    { field1: ...,
      field2: ...,
      aList: [
        { Id: ... },
        { fielda1: ... },
        { fielda2: ... },
        ...
      ],
      fieldN: ...,
      fieldN1: ...
    }

I only need data inside aList so my pipeline is like so:

    input {
      file {
        path => "/usr/local/Cellar/logstash-full/7.11.0/data/data.json"
        codec => "json"
        sincedb_path => "NULL"
      }
    }

    filter {
      json {
         source => "message" 
      }
      split {
         field => "[aList]"
      }
      if [aList][fielda1] {
        mutate {
          add_field => {
            "fielda1" => "%{[aList][fielda1]}"
          }
        }
      }
      ....
      mutate {
        remove_field => ["path", "@version", "@timestamp", "host", "message", "field1", ..., "fieldN1"]
        rename => { "[aList][Id]" => "[@metadata][Id]" }
      }
      ...
    }
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "mydata" 
        document_id => "%{[@metadata][Id]}"
      }
    }
  1. I'm not sure if the split filter is needed when I use the json input codec. In any case, adding new fields at the root level keeps aList in the data, and if I drop aList, then the filters that reference it find nothing; I guess the code inside filter {} is not executed sequentially. In short, how do I get rid of aList once I have copied all its fields to the upper level?

  2. Logstash had been processing data since yesterday, but today it seems to be stalled and refuses to process my conf file. I have removed any sincedb* files, even though there shouldn't be any, since I use sincedb_path => "NULL". With --debug I see that it has compiled the filters, but then it just repeats this:

    [2021-02-20T11:20:59,870][DEBUG][org.logstash.execution.PeriodicFlush][main] Pushing flush onto pipeline.
    [2021-02-20T11:21:01,243][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
    [2021-02-20T11:21:01,244][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
    [2021-02-20T11:21:02,148][DEBUG][filewatch.sincedbcollection][main][dfd2caffb9cafaaef3de80e622b99fe8bd7b28ce8d7353878192e580cdaf3cdd] writing sincedb (delta since last write = 15)
    [2021-02-20T11:21:04,059][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu
    [2021-02-20T11:57:41,083][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
    [2021-02-20T11:57:41,085][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
    [2021-02-20T11:57:41,596][DEBUG][org.logstash.execution.PeriodicFlush][main] Pushing flush onto pipeline.
    [2021-02-20T11:57:43,487][DEBUG][filewatch.sincedbcollection][main][038696e67175ced52e017c718a78d1c797905ce61b90eba968c0e2ce28cb4e63] writing sincedb (delta since last write = 15)
    [2021-02-20T11:57:45,750][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu

Any ideas? Thank you in advance.

If you do not want the in-memory sincedb persisted to disk across restarts, then on Windows you would use sincedb_path => "NUL" (with one L). On UNIX you should use sincedb_path => "/dev/null".
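For example, on UNIX-like systems (including macOS) the file input from the original post would become something like this sketch, reusing the same path:

    input {
      file {
        path => "/usr/local/Cellar/logstash-full/7.11.0/data/data.json"
        codec => "json"
        sincedb_path => "/dev/null"
      }
    }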

Thanks for your reply, Badger. If I use /dev/null on macOS I get this error:

Error: Permission denied – Permission denied
Exception: Errno::EACCES

That makes me wonder if you are hitting this bug. Do you get a stack trace?

Regarding the issue that it no longer imports the data into ES, I have found a number of posts about

One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu

and I don't know what the solution is. Some posts say it is not a problem, others that you need to modify jvm.options or logstash.yml. Whatever the final answer is, these files definitely don't exist on macOS.

Even if I remove sincedb_path => "NULL" and use the default path, I still don't see any progress. In fact, things are worse, because it seems the problem is not related to ES at all. Even with this output:

    output {
    #  elasticsearch {
    #    hosts => ["http://localhost:9200"]
    #    index => "flight-data"
    #    document_id => "%{[@metadata][Id]}"
    #  }
      file {
        path => "/usr/local/Cellar/logstash-full/7.11.0/data/output.json"
      }
      stdout {
        codec => rubydebug
      }
    }

I don't see any output.json. The strange thing is that I didn't change anything on my machine, and yesterday it was working!
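As a sanity check, a minimal pipeline like this sketch (bypassing the file input entirely) would at least show whether events flow through Logstash at all:

    input {
      stdin {
        codec => "json"
      }
    }
    output {
      stdout {
        codec => rubydebug
      }
    }

If events typed on stdin come out of rubydebug, the problem is with the file input (most likely the sincedb state), not with the pipeline itself.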

    [2021-02-20T17:50:23,793][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
    [2021-02-20T17:50:24,412][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
    [2021-02-20T17:50:24,414][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
    [2021-02-20T17:50:25,551][DEBUG][org.logstash.execution.PeriodicFlush][main] Pushing flush onto pipeline.
    [2021-02-20T17:50:25,597][DEBUG][logstash.outputs.file    ][main] Starting flush cycle
    [2021-02-20T17:50:26,786][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>["/usr/local/Cellar/logstash-full/7.11.0/config/data.conf"]}
    [2021-02-20T17:50:26,787][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/usr/local/Cellar/logstash-full/7.11.0/config/flight.conf"}
    [2021-02-20T17:50:26,792][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
    [2021-02-20T17:50:27,394][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu
    [2021-02-20T17:50:27,602][DEBUG][logstash.outputs.file    ][main] Starting flush cycle

Why does the log say the following?

    Skipping the following files while reading config since they don't match the specified glob pattern {:files=>["/usr/local/Cellar/logstash-full/7.11.0/config/data.conf"]}
    [2021-02-20T17:58:05,257][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
    [2021-02-20T17:58:05,580][DEBUG][org.logstash.execution.PeriodicFlush][main] Pushing flush onto pipeline.
    [2021-02-20T17:58:05,671][DEBUG][logstash.outputs.file    ][main] Starting flush cycle

Well, let me try to answer my own questions:

  1. sincedb_path => "NULL" isn't the same as /dev/null on macOS, as I had thought; it simply creates a file named NULL in the Logstash folder, which is the same as leaving the default path. This must have been the reason for the stall. So, unless the Permission denied error is resolved on Mac, there is no other way than to remove the sincedb file each time, and --config.reload.automatic won't work.
  2. You can get rid of aList simply by adding this at the end of the filter block:

         mutate {
           remove_field => ["aList"]
         }