Logstash - ES index mixed up

Hey folks,

Currently I'm trying to ingest multiple firewall log sources by separating them into different folders and indices. I created a conf file for each source so that each one would be ingested into its own index. However, I've run into an issue where all the log sources are mixed up across the indices. In short, every index contains data from the different log sources even though I separated them at the conf level. Can someone point out my mistake here? Following is one of the conf files:

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                sincedb_path => "/var/lib/logstash/sincedb"
        }
}

filter {
        grok {
                match => ["message", "%{NOTSPACE:devname} %{NOTSPACE:device_id} %{IPORHOST:rempip} %{IPORHOST:locip} %{NOTSPACE:msg} %{NOTSPACE:group}"]
        }

        kv {}

        mutate { add_field => { "eventtime" => 1675199470000000000 } }
        mutate { gsub => [ "eventtime", "\d{6}$", "" ] }
        date { match => [ "[eventtime][0]", "UNIX_MS", "ISO8601" ] target => "@timestamp" timezone => "UTC" }

        mutate {
                convert => { "sentbyte" => "integer" }
                convert => { "rcvdbyte" => "integer" }
        }
}

output {
        elasticsearch {
                hosts => "localhost"
                index => "fortigate-anomalylog"
        }

        stdout { codec => rubydebug }
}
path => "/logstash/fortigate_anomaly_log/*.log"

The path is what changes for the different log sources.

Following is the index screenshot; the indices appear to be mixed up and contain the same data across them:

You have 6 logs which have different internal structures. Are you reading everything from the same directory, with everything written to the fortigate-anomalylog index?

I cannot see why you haven't used if conditionals and fields from the logs to distinguish them from each other, including in the output part. Alternatively, you can make 6 pipelines and process them separately. See the sketch below for the conditional approach.
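
A minimal sketch of that conditional approach, assuming a hypothetical tag added in each file input to mark its source (the tag names, the per-source sincedb path, and the second index name are placeholders, not anything from your config):

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                sincedb_path => "/var/lib/logstash/sincedb_anomaly"
                tags => ["anomaly"]
        }
}

output {
        if "anomaly" in [tags] {
                elasticsearch {
                        hosts => "localhost"
                        index => "fortigate-anomalylog"
                }
        } else if "traffic" in [tags] {
                elasticsearch {
                        hosts => "localhost"
                        index => "fortigate-trafficlog"
                }
        }
}

With one merged pipeline, every event reaches every output, so each elasticsearch block has to be guarded by a conditional like this.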

Hi Rios,

Thanks for the reply. First, I'm trying out the ELK stack to ingest the firewall logs, and honestly I'm not an expert; this is something I came up with by going through blogs and trial and error myself. If there is optimization that could be done, sure, I will do it, but for now this is merely a testing phase and I really appreciate your comments.

To answer your question, I'm reading the logs from different sub-directories, and each has its own config file.

You must know the field structure to parse correctly. The kv filter is useful; however, in some cases logs have a sort of header which needs to be parsed by grok or dissect first, as in the sketch below.
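
A minimal sketch of that pattern, assuming a hypothetical line of the form "<timestamp> <host> key=value key=value ..." (the field names here are placeholders):

filter {
        # Peel off the hypothetical header; dissect splits on the literal spaces
        dissect {
                mapping => { "message" => "%{log_ts} %{log_host} %{kv_payload}" }
        }
        # Parse only the key=value remainder
        kv {
                source => "kv_payload"
        }
}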

How are you running logstash? As a systemd service?

If you have multiple configurations and want them to be executed as separate pipelines, you need to configure Logstash to run multiple pipelines as explained in the documentation.

By default Logstash runs everything inside /etc/logstash/conf.d as a single pipeline: it merges all files in this folder into one pipeline, and all data received by the inputs passes through all the filters and is sent to all the outputs, unless you use conditionals to filter this.
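
For reference, a minimal pipelines.yml sketch (the pipeline ids and config paths are placeholders):

- pipeline.id: fortigate-anomaly
  path.config: "/etc/logstash/conf.d/fortigate_anomaly.conf"

- pipeline.id: fortigate-traffic
  path.config: "/etc/logstash/conf.d/fortigate_traffic.conf"

With this in place, each config file gets its own isolated event flow, so no conditionals are needed in the outputs.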


Hi @leandrojmp,

That makes sense (definitely), and I have now changed the pipelines.yml config. The new issue is that the defined pipelines are not loaded after I restart the logstash service:

curl -XGET "localhost:9600/_node/pipelines?pretty"

"pipelines" : {
    "main" : {
      "ephemeral_id" : "redacted",
      "hash" : "redacted",
      "workers" : 12,
      "batch_size" : 125,
      "batch_delay" : 50,
      "config_reload_automatic" : false,
      "config_reload_interval" : 3000000000,
      "dead_letter_queue_enabled" : false
    }


I even changed the configuration path in logstash.yml (although it's not needed, according to ChatGPT). Unfortunately no luck, and I'm still seeing only the main pipeline loaded. Did I miss something here?


You are not receiving data because you are reading "old" files whose positions are already recorded in the sincedb_path.
Try with only one pipeline and set: sincedb_path => "/dev/null"
If data arrives in ES, even under a temporary index name, that means your files were processed.
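
In other words, in the file input from the original config, just swap the sincedb_path (a minimal sketch):

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                # /dev/null means read positions are never persisted,
                # so the files are re-read from the beginning on every restart
                sincedb_path => "/dev/null"
        }
}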

Not sure whether the filter section is OK or not, since we don't have the data yet.

You didn't say how you are running Logstash. Are you running it as a service?

Also, what do you have in the logs after you restarted it? Please share the logs.

There is no main pipeline in your pipelines.yml, so your Logstash is not using that file; this can happen if you are not running it as a service, or if you are using the -f parameter on the command line.
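
A sketch of the difference (the config path is a placeholder); when -f or -e is given on the command line, Logstash ignores pipelines.yml and runs everything as a single pipeline named main:

# Single pipeline named "main"; pipelines.yml is ignored
bin/logstash -f /etc/logstash/conf.d/fortigate_anomaly.conf

# Started as a systemd service, Logstash runs without -f and reads pipelines.yml
sudo systemctl restart logstash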

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.