Running Logstash properly

I have installed Logstash using rpm and enabled it with systemctl. I have to stash a CSV file and create a table in Grafana daily. Do I have to run bin/logstash -f /etc/logstash/conf.d/logstash.config via cron if I want to stash a CSV file hourly? I am not seeing any content in Grafana when I try to add a table panel.


Indices are getting created.

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .kibana _G9Bg9S8SruqNG12CBrU2g 1 0 2 1 16kb 16kb
yellow open implementationcrq-2018.03.15 H7B50zQZQsuwHsx5XHK2pA 5 1 43 0 88.4kb 88.4kb
yellow open pendingcrq-2018.03.15 Nnb8_XnVQQmxPMzpBJO9tA 5 1 2 0 19.1kb 19.1kb
yellow open completedcrq-2018.03.15 5FIT8AxsSfKry22xKEleHQ 5 1 16 0 85.8kb 85.8kb
yellow open closedcrq-2018.03.15 ecf0hIpZSUCObNR7hDZQKg 5 1 53 0 129.7kb 129.7kb

You can keep Logstash running all the time and configure a file input to read *.csv or whatever the files are named. When you want Logstash to process a file, just copy it to the directory that Logstash is monitoring.
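
For instance, a minimal sketch of such a file input (the directory and sincedb path below are only illustrative, borrowed from the configs later in this thread):

input {
  file {
    # watch the drop directory; any new *.csv copied here gets picked up
    path => "/samba/CRQ_reports/*.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_crq"
  }
}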

I have created a cron job to run my Logstash config every 3 minutes. Is that allowed? In my logs I am seeing this.

[2018-03-18T22:03:11,256][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-03-18T22:03:11,262][FATAL][logstash.runner ] Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.
[2018-03-18T22:03:11,265][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: org.jruby.exceptions.RaiseException: (SystemExit) exit
[2018-03-18T22:06:11,320][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/share/logstash/modules/fb_apache/configuration"}

Should I enable the Logstash service via systemctl rather than running Logstash using cron?

Yes, run Logstash as a background service instead and follow the advice I gave earlier. Logstash doesn't shut down when it has processed all data via the file input, so starting it every three minutes doesn't make sense.

Thanks. Follow-up question: I am receiving this error in the logs: "the type event field won't be used to determine the document _type"

What's wrong in my Logstash config?

input {
  file {
    type => "samscrq"
    path => "/samba/CRQ_reports/samscrq.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_samscrq"
  }
  file {
    type => "pending"
    path => "/samba/CRQ_reports/pending.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_pending"
  }
  file {
    type => "implementation"
    path => "/samba/CRQ_reports/implementation.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_implementation"
  }
  file {
    type => "completed"
    path => "/samba/CRQ_reports/completed.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_completed"
  }
  file {
    type => "closed"
    path => "/samba/CRQ_reports/closed.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_closed"
  }
}
filter {
  if [type] == "samscrq" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "completed" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "closed" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "implementation" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "pending" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
}
output {
  if [type] == "samscrq" {
    elasticsearch {
      hosts => ["146.40.233.10:9200"]
      index => "samscrq"
    }
  }
  if [type] == "completed" {
    elasticsearch {
      hosts => ["146.40.233.10:9200"]
      index => "completed"
    }
  }
  if [type] == "closed" {
    elasticsearch {
      hosts => ["146.40.233.10:9200"]
      index => "closed"
    }
  }
  if [type] == "implementation" {
    elasticsearch {
      hosts => ["146.40.233.10:9200"]
      index => "implementation"
    }
  }
  if [type] == "pending" {
    elasticsearch {
      hosts => ["146.40.233.10:9200"]
      index => "pending"
    }
  }
}

It means exactly what it says: the value of the type field will no longer be used by default to determine the type of the documents. If this really is what you want, you can use

document_type => "%{type}"

in your elasticsearch output configuration.
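
For example, the output could look something like this (host and index name are copied from your config; only document_type is new):

elasticsearch {
  hosts => ["146.40.233.10:9200"]
  index => "samscrq"
  # reuse the value of the type field as the document _type
  document_type => "%{type}"
}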

What should be my correct approach? This is the scenario: every 2 hours a CSV file is dumped in my samba directory. I have a Python script that parses the CSV file and creates new CSV files (pending.csv, completed.csv and implementation.csv). From these new input files I would like to create 3 different indices that I will be using for my table panels in Grafana. I want to display this in my dashboard with table panels.

I have updated it; I used type instead of type. But still the same error: "the type event field won't be used to determine the document _type"

input {
  file {
    type => "samscrq"
    path => "/samba/CRQ_reports/samscrq.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_samscrq"
  }
  file {
    type => "pending"
    path => "/samba/CRQ_reports/pending.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_pending"
  }
  file {
    type => "implementation"
    path => "/samba/CRQ_reports/implementation.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_implementation"
  }
  file {
    type => "completed"
    path => "/samba/CRQ_reports/completed.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_completed"
  }
  file {
    type => "closed"
    path => "/samba/CRQ_reports/closed.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_closed"
  }
}
filter {
  if [type] == "samscrq" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "completed" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "closed" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "implementation" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
  if [type] == "pending" {
    csv {
      separator => ","
      columns => ["change_id","summary","Status","start_date","coordinator"]
      skip_empty_columns => true
      skip_empty_rows => true
    }
  }
}
output {
  if [type] == "samscrq" {
    elasticsearch {
      hosts => ["XXX.XX.XXX.10:9200"]
      index => "samscrq"
    }
  }
  if [type] == "completed" {
    elasticsearch {
      hosts => ["XXX.XX.XXX.10:9200"]
      index => "completed"
    }
  }
  if [type] == "closed" {
    elasticsearch {
      hosts => ["XXX.XX.XXX.10:9200"]
      index => "closed"
    }
  }
  if [type] == "implementation" {
    elasticsearch {
      hosts => ["XXX.XX.XXX.10:9200"]
      index => "implementation"
    }
  }
  if [type] == "pending" {
    elasticsearch {
      hosts => ["XXX.XX.XXX.10:9200"]
      index => "pending"
    }
  }
}

I have updated it; I used type instead of type.

What?

But still the same error: "the type event field won't be used to determine the document _type"

It's not an error, it's a warning. If you don't want to see the warning, you can set document_type to something.

I see. I thought it was already an error and the config was not good at all.

The earlier config was using tag where it was supposed to be type :slight_smile: sorry for that.

What is your Python script doing to the incoming CSV file? Is it using the Status field to decide which file to write the line to?
If so, then Logstash can do all of what you want to do in a single pass of the incoming file.
I am sure you can use Logstash functions to send each event to the correct index on-the-fly using string interpolation.
Perhaps:

input {
  file {
    type => "samscrq"
    path => "/samba/CRQ_reports/incoming.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_crq"
  }
}
filter {
  csv {
    separator => ","
    columns => ["change_id","summary","status","start_date","coordinator"]
    skip_empty_columns => true
    skip_empty_rows => true
  }
  mutate {
    lowercase => ["status"]
  }
}
output {
  elasticsearch {
    hosts => ["XXX.XX.XXX.10:9200"]
    index => "%{status}"
  }
}

Yes, it is using the status field to decide which file to write the line to. I'll try this Logstash config and let you know the outcome. Thanks.

I tried running the config below. See the pic below; I am seeing these kinds of indices.

input {
  file {
    type => "samscrq"
    path => "/samba/CRQ_reports/*.csv"
    start_position => "beginning"
    sincedb_path => "/opt/sincedb/.sincedb_crq"
  }
}
filter {
  csv {
    separator => ","
    columns => ["change_id","summary","status","start_date","end_date","coordinator"]
    skip_empty_columns => true
    skip_empty_rows => true
  }
  mutate {
    lowercase => ["status"]
  }
}
output {
  elasticsearch {
    hosts => ["XXX.XX.XXX.XX:9200"]
    index => "%{status}"
  }
}

If you replace the elasticsearch output with

  stdout { codec => rubydebug }

what does one event look like? Copy and paste it here.
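
For example, a temporary debugging output section (just a sketch; it stands in for the elasticsearch output while troubleshooting) could look like:

output {
  # print every parsed event to the console so the fields can be inspected
  stdout { codec => rubydebug }
}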
