All files collected under one index only. Is it possible to have multiple indexes for multiple files?

Every time I run GET _cat/indices in Kibana, I only get one index, under which all the files are collected. I want three indexes for the three different log files defined in Filebeat.
This is my logstash.conf file:

input {
  
  beats {
    port => 5044
  }
}

filter {
  if [log_type] == "access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [log_type] == "errors" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [log_type] == "dispatcher" {
    grok {
      match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}\[%{DATA:threadId}]%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}%{JAVACLASS:javaClass}%{SPACE}-%{SPACE}?(\[%{NONNEGINT:incidentId}])%{GREEDYDATA:message}" }
    }
  }
}
 
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{type}-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

It is possible if you change your index pattern to use the log type.

Something like this:
index => "%{log_type}-%{+YYYY.MM.dd}"

Hi @mancharagopan, thanks for replying! I tried this but it didn't work:

yellow logstash                 open L6WPHCdMTP6MEJdxKMX5OQ 1 1 41115 0   3.7mb 2019-12-23T07:01:42.362Z

This is the index that gets created when I update my output to:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{log_type}-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

And my filebeat.yml is:

filebeat.inputs:
-
  paths:
    - /home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/log2.log
  enabled: true
  input_type: log
  fields:
    log_type: access

-
  paths:
    - /home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/logz.log
  enabled: true
  input_type: log
  fields:
    log_type: errors

-
  paths:
    - /home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/dispatcher-log.log
  enabled: true
  input_type: log
  fields:
    log_type: dispatch

output.logstash:
  hosts: ["localhost:5044"]

Can you run Logstash at debug log level and give me the output?

I suspect this should be index => "%{[fields][log_type]}-%{+YYYY.MM.dd}".
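
For example, something like this (only a sketch, not a complete pipeline); the filter conditionals would presumably need the same nested path:

filter {
  # Filebeat puts the custom field under "fields", so reference [fields][log_type]
  if [fields][log_type] == "access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # sprintf reference to the nested field for the index name
    index => "%{[fields][log_type]}-%{+YYYY.MM.dd}"
  }
}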

Hi @Christian_Dahlqvist, that makes sense! So I updated the index line, and there is still only one index shown in Kibana:

yellow logstash-2019.12.23-000001 open DP7Ivo02QDSc9EKAAjZXcw 1 1 49173 0    16mb 2019-12-23T07:29:46.231Z

In the Discover tab, when I filter the logs by fields.log_type: access/errors/dispatch, the results are shown. But the index is still the one above! So if the log lines are tagged with the field, why doesn't the index name pick it up?

Sure @mancharagopan

[2019-12-22T23:37:58,584][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2019-12-22T23:37:59,165][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2019-12-22T23:37:59,167][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2019-12-22T23:38:02,823][DEBUG][org.logstash.execution.PeriodicFlush][main] Pushing flush onto pipeline.

And it ends with the loop below:

[2019-12-22T23:01:36,500][DEBUG][org.logstash.beats.BeatsHandler][main] [local: 127.0.0.1:5044, remote: 127.0.0.1:51268] Sending a new message for the listener, sequence: 2034
[2019-12-22T23:01:36,509][DEBUG][org.logstash.beats.BeatsHandler][main] [local: 127.0.0.1:5044, remote: 127.0.0.1:51268] Sending a new message for the listener, sequence: 2035
[2019-12-22T23:01:36,510][DEBUG][org.logstash.beats.BeatsHandler][main] [local: 127.0.0.1:5044, remote: 127.0.0.1:51268] Sending a new message for the listener, sequence: 2036
[2019-12-22T23:01:36,510][DEBUG][org.logstash.beats.BeatsHandler][main] [local: 127.0.0.1:5044, remote: 127.0.0.1:51268] Sending a new message for the listener, sequence: 2037
[2019-12-22T23:01:36,510][DEBUG][org.logstash.beats.BeatsHandler][main] [local: 127.0.0.1:5044, remote: 127.0.0.1:51268] Sending a new message for the listener, sequence: 2038

Should I provide more of the logs? Please let me know!

@Mehak_Bhargava

Run the following command to tail the Logstash log file live:
sudo tail -f /var/log/logstash/logstash-plain.log

Collect the log while Logstash is sending your events to Elasticsearch, and share it.

@mancharagopan, I followed the command, using my actual logstash-plain.log path instead of /var/log/logstash/logstash-plain.log:

mehak@mehak-VirtualBox:~/Documents/logstash-7.4.0/bin$ sudo tail -f /home/mehak/Documents/logstash-7.4.0/logs/logstash-plain.log
/home/mehak/Documents/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `block in run'
[2019-12-22T23:38:10,579][DEBUG][logstash.outputs.elasticsearch][main] Waiting for in use manticore connections
[2019-12-22T23:38:10,580][DEBUG][logstash.agent           ] 2019-12-22 23:38:10 -0800: Listen loop error: #<IOError: closed stream>
[2019-12-22T23:38:10,581][DEBUG][logstash.agent           ] org/jruby/RubyIO.java:3551:in `select'
/home/mehak/Documents/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/home/mehak/Documents/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `block in run'
[2019-12-22T23:38:10,582][DEBUG][logstash.agent           ] 2019-12-22 23:38:10 -0800: Listen loop error: #<IOError: closed stream>
[2019-12-22T23:38:10,583][DEBUG][logstash.agent           ] org/jruby/RubyIO.java:3551:in `select'
/home/mehak/Documents/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/home/mehak/Documents/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `block in run'
^C

I had to force-quit the command, as nothing got printed after 'block in run'. What does this command show here?

@Mehak_Bhargava
Please share a sample of the data stored in your current index.

The above command shows your Logstash log file live as it gets updated. Did you change the location of the log file? By default it will be stored inside the above directory.

Did you configure an index pattern in the Filebeat config file?

@mancharagopan I have always had the logstash-plain.log file under that directory. This is all on Ubuntu, so when I downloaded and installed everything, my Logstash ended up under the Documents folder.

I did not configure an index pattern in the Filebeat config file. How does that work, and why in Filebeat?
This is from Kibana's Discover:

Dec 22, 2019 @ 23:30:20.826  08/10/2019 12:15:19 579   (null)                 DEBUG   10   Processing start for record : 136378980959205484#115 ID: 49619767
access
logstash-2019.12.23-000001

Dec 22, 2019 @ 23:30:20.826  2019-10-08 12:32:10,480 [Thread-12]             INFO   c.e.d.s.WorkflowInstanceManager - resumeWorkflowInstance; workflowInstanceId=24935899
dispatch
logstash-2019.12.23-000001

Dec 22, 2019 @ 23:30:20.829  08/10/2019 12:15:19 599   (null)                 DEBUG   10   Get Incident by id for inbound Activity Start : Incident Id  (CacheRepositoryProxy:start) = 24748988
access
logstash-2019.12.23-000001

This is from GET logstash-2019.12.23-000001/_search:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "logstash-2019.12.23-000001",
        "_type" : "_doc",
        "_id" : "yCCpMW8ByDrQo4Z4_MF-",
        "_score" : 1.0,
        "_source" : {
          "ecs" : {
            "version" : "1.1.0"
          },
          "agent" : {
            "type" : "filebeat",
            "ephemeral_id" : "075b5945-9b1c-4cc4-af4b-b70092f369c4",
            "hostname" : "mehak-VirtualBox",
            "version" : "7.4.0",
            "id" : "bad135c8-d359-4936-b515-79eb4bb24630"
          },
          "log" : {
            "offset" : 846421,
            "file" : {
              "path" : "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/log2.log"
            }
          },
          "tags" : [
            "beats_input_codec_plain_applied"
          ],
          "@version" : "1",
          "fields" : {
            "log_type" : "access"
          },
          "@timestamp" : "2019-12-23T07:30:20.826Z",
          "host" : {
            "name" : "mehak-VirtualBox"
          },
          "message" : "08/10/2019 12:15:19 579   (null)                 DEBUG   10   Processing start for record : 136378980959205484#115 ID: 49619767",
          "type" : "another_test"
        }
      },

Did you try this?

Yes, and I still only get logstash-<date> as the index. Should I add an index pattern in filebeat.yml too?

@mancharagopan @Christian_Dahlqvist

It's interesting that I changed my entire logstash.conf to use manually named indexes and I still get logstash-2019.12.23-000001. Why is this happening?

input {
  file {
    path => "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/log2.log"
    add_field => { "Differentindex" => "access" }
  }
  file {
    path => "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/logz.log"
    add_field => { "Differentindex" => "error" }
  }
  file {
    path => "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/dispatcher-log.log"
    add_field => { "Differentindex" => "dispatch" }
  }
}

filter {
  if [add_field] == "access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [add_field] == "errors" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  } else if [add_field] == "dispatcher" {
    grok {
      match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}\[%{DATA:threadId}]%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}%{JAVACLASS:javaClass}%{SPACE}-%{SPACE}?(\[%{NONNEGINT:incidentId}])%{GREEDYDATA:message}" }
    }
  }
}
 
output {
  if [add_field] == "access" {
    elasticsearch {
      hosts => ["localhost:9200"]
      sniffing => true
      manage_template => false
      index => "access-index"
    }
  } else if [add_field] == "errors" {
    elasticsearch {
      hosts => ["localhost:9200"]
      sniffing => true
      manage_template => false
      index => "errors-index"
    }
  }

  stdout {
    codec => rubydebug
  }
}

Can you share the content of one of the log files so I can test it?
Also, your filter conditions are wrong: add_field is an option that adds a new field; it is not itself a field you can test in a conditional. Please refer to
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-add_field
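
For example, something like this (just a sketch using the Differentindex field from your config above):

input {
  file {
    path => "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/log2.log"
    # add_field creates a field named "Differentindex" on each event
    add_field => { "Differentindex" => "access" }
  }
}

filter {
  # Test the field that add_field created, not a field literally named "add_field"
  if [Differentindex] == "access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}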

As far as I can see you do not have any field named add_field defined, which would mean none of those outputs are used. Do you by any chance have any other files in the config directory that have an elasticsearch output?

@Mehak_Bhargava

How many config files do you have in the Logstash config folder?

If you have a single index called logstash then you are seeing ILM in action.
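
If that is the case, one option is to disable ILM in the elasticsearch output so your index setting takes effect, something like this (only a sketch, assuming the ilm_enabled option of the elasticsearch output):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
    # Sketch: with ILM disabled, the output no longer writes to the
    # logstash rollover alias and honours the index setting below
    ilm_enabled => false
    index => "%{[fields][log_type]}-%{+YYYY.MM.dd}"
  }
}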


Yes, @mancharagopan. Here is log file 1:

08/10/2019 12:14:48 599   (null)                 DEBUG   27   GetUpdatedIncident for Incident Id 24749162 on thread 04fd1833-8275-46ff-816f-9acf0c1f7724:80759 on Thread 27
08/10/2019 12:14:48 600   (null)                 DEBUG   19   Updating cache with activity (152775689) Add Item:True Modify Item: False
08/10/2019 12:14:48 601   (null)                 DEBUG   67   Applying dynamic filter
08/10/2019 12:14:48 604   (null)                 DEBUG   43   GetUpdatedIncidentIndicators for Incident Id 24749170 on thread 04fd1833-8275-46ff-816f-9acf0c1f7724:80752 on Thread 43 Context Id : 4ec7ffe6-431c-4280-bca1-263e20077cbc

In the input section, I can add type instead of add_field:

input {
  file {
    path => "/home/mehak/Documents/filebeat-7.4.0-linux-x86_64/logs/log2.log"
    type => "access"
  }
}

Is this better?

@Christian_Dahlqvist, so I should define add_field outside of the input section in pipeline.conf here?

I have first-pipeline.conf, second-pipeline.conf, logstash.conf, and pipeline.conf. Below are their contents; first-pipeline.conf and second-pipeline.conf both contain only empty sections:

input{}
filter{}
output{}
input{}
filter{}
output{}

logstash.conf is below:

# Read input from filebeat by listening to port 5044 on which filebeat will send the data
input {
    beats {
        port => "5044"
    }
}
 
filter {
  # If the log line contains a tab character followed by 'at' then we will tag that entry as stacktrace
  if [type] == "DispatcherApp" {
    grok {
      match => { "message" => "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{USER:extra} %{HTTPDATE:timestamp} \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})" }
    }
  } else if [type] == "IncidentAgent" {
    grok {
      match => { "message" => "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{USER:extra} %{HTTPDATE:timestamp} \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})" }
    }
  }
}
output {

  stdout {
    codec => rubydebug
  }

  # Sending properly parsed log events to elasticsearch
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "DispApp"
    #index => "%{[@metadata][type]}-%{+YYYY.MM.dd}"
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "InciAgent"
  }
}
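
I also realise that without conditionals both elasticsearch blocks above would receive every event, so each document would be written to both indexes. Maybe something like this instead (just a guess; index names lowercased, since Elasticsearch rejects uppercase index names)?

output {
  if [type] == "DispatcherApp" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "dispapp-%{+YYYY.MM.dd}"
    }
  } else if [type] == "IncidentAgent" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "inciagent-%{+YYYY.MM.dd}"
    }
  }
}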

pipeline.conf is the one I copied earlier in the comments.
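
Since all of these .conf files sit in the same config directory, could Logstash be loading more than one of them into the same pipeline? A sketch of a pipelines.yml that would pin it to a single file (the path here is just a guess at my setup):

# pipelines.yml (sketch): run only one config file so the other .conf files
# in the directory are not picked up as well
- pipeline.id: main
  path.config: "/home/mehak/Documents/logstash-7.4.0/config/pipeline.conf"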