Logstash - ES index mixed up

Hey folks,

Currently I'm trying to ingest multiple firewall log sources by separating them into different folders and indices. I created a conf file for each source so that each one would be ingested into its own index. However, I've run into an issue where all the log sources are mixed up across the indices. In short, every index contains data from the different log sources even though I separated them at the conf level. Can someone point out my mistake here? Following is one of the conf files:

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                sincedb_path => "/var/lib/logstash/sincedb"
        }
}

filter {
        grok {
                match => ["message", "%{NOTSPACE:devname} %{NOTSPACE:device_id} %{IPORHOST:rempip} %{IPORHOST:locip} %{NOTSPACE:msg} %{NOTSPACE:group}"]
        }

        kv {}

        mutate { add_field => { "eventtime" => 1675199470000000000 } }
        mutate { gsub => [ "eventtime", "\d{6}$", "" ] }
        date { match => [ "[eventtime][0]", "UNIX_MS", "ISO8601" ] target => "@timestamp" timezone => "UTC" }

        mutate {
                convert => { "sentbyte" => "integer" }
                convert => { "rcvdbyte" => "integer" }
        }
}

output {
        elasticsearch {
                hosts => "localhost"
                index => "fortigate-anomalylog"
        }

        stdout { codec => rubydebug }
}
path => "/logstash/fortigate_anomaly_log/*.log"

The path is what changes for the different log sources.

Following is the index screenshot; the indices appear to be mixed up and contain the same data across them:

You have 6 logs which have different internal structures. Are you reading everything from the same directory, with everything written to the fortigate-anomalylog index?

I cannot see why you haven't used if conditionals and fields from the logs to distinguish them from each other, including in the output part. Alternatively, you can make 6 pipelines and process them separately. See the sketch below for the conditional approach.
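
A minimal sketch of that conditional approach, assuming a hypothetical tag added in each file input to mark its source (the tag names, the per-source sincedb path, and the second index name are placeholders, not anything from your config):

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                sincedb_path => "/var/lib/logstash/sincedb_anomaly"
                tags => ["anomaly"]
        }
}

output {
        if "anomaly" in [tags] {
                elasticsearch {
                        hosts => "localhost"
                        index => "fortigate-anomalylog"
                }
        } else if "traffic" in [tags] {
                elasticsearch {
                        hosts => "localhost"
                        index => "fortigate-trafficlog"
                }
        }
}

With one merged pipeline, every event reaches every output, so each elasticsearch block has to be guarded by a conditional like this.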

Hi Rios,

Thanks for the reply. First, I'm trying out the ELK stack to ingest the firewall logs, and honestly I'm not an expert; this is something I came up with by going through blogs and trial and error myself. If there is optimization that could be done, sure, I will do it, but for now this is merely a testing phase and I really appreciate your comments.

To answer your question, I'm reading the logs from different sub-directories, and each has its own config file.

You must know the field structure to parse correctly. The kv filter is useful; however, in some cases logs have a sort of header which needs to be parsed by grok or dissect first, as in the sketch below.
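
A minimal sketch of that pattern, assuming a hypothetical line of the form "<timestamp> <host> key=value key=value ..." (the field names here are placeholders):

filter {
        # Peel off the hypothetical header; dissect splits on the literal spaces
        dissect {
                mapping => { "message" => "%{log_ts} %{log_host} %{kv_payload}" }
        }
        # Parse only the key=value remainder
        kv {
                source => "kv_payload"
        }
}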

How are you running logstash? As a systemd service?

If you have multiple configurations and want them to be executed as separate pipelines, you need to configure Logstash to run multiple pipelines as explained in the documentation.

By default Logstash runs everything inside /etc/logstash/conf.d as a single pipeline: it merges all files in this folder into one pipeline, and all data received by the inputs passes through all the filters and is sent to all the outputs, unless you use conditionals to filter this.
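
For reference, a minimal pipelines.yml sketch (the pipeline ids and config paths are placeholders):

- pipeline.id: fortigate-anomaly
  path.config: "/etc/logstash/conf.d/fortigate_anomaly.conf"

- pipeline.id: fortigate-traffic
  path.config: "/etc/logstash/conf.d/fortigate_traffic.conf"

With this in place, each config file gets its own isolated event flow, so no conditionals are needed in the outputs.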


Hi @leandrojmp,

That makes sense (definitely), and I have now changed the pipelines.yml config. The new issue is that the defined pipelines are not loaded after I restart the logstash service:

curl -XGET "localhost:9600/_node/pipelines?pretty"

"pipelines" : {
    "main" : {
      "ephemeral_id" : "redacted",
      "hash" : "redacted",
      "workers" : 12,
      "batch_size" : 125,
      "batch_delay" : 50,
      "config_reload_automatic" : false,
      "config_reload_interval" : 3000000000,
      "dead_letter_queue_enabled" : false
    }


I even changed the configuration path in logstash.yml (although it's not needed, according to ChatGPT). Unfortunately no luck, and I'm still seeing only the main pipeline loaded. Did I miss something here?


You are not receiving data because you are reading "old" files whose positions are already recorded in the sincedb_path.
Try with only one pipeline and set: sincedb_path => "/dev/null"
If data arrives in ES, even under a temporary index name, that means your files were processed.
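
In other words, in the file input from the original config, just swap the sincedb_path (a minimal sketch):

input {
        file {
                path => "/logstash/fortigate_anomaly_log/*.log"
                start_position => "beginning"
                # /dev/null means read positions are never persisted,
                # so the files are re-read from the beginning on every restart
                sincedb_path => "/dev/null"
        }
}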

Not sure whether the filter section is OK or not, since we don't have the data yet.

You didn't say how you are running Logstash. Are you running it as a service?

Also, what do you have in the logs after you restarted it? Please share the logs.

There is no main pipeline in your pipelines.yml, so your Logstash is not using that file; this can happen if you are not running it as a service, or if you are using the -f parameter on the command line.
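
A sketch of the difference (the config path is a placeholder); when -f or -e is given on the command line, Logstash ignores pipelines.yml and runs everything as a single pipeline named main:

# Single pipeline named "main"; pipelines.yml is ignored
bin/logstash -f /etc/logstash/conf.d/fortigate_anomaly.conf

# Started as a systemd service, Logstash runs without -f and reads pipelines.yml
sudo systemctl restart logstash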

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.