Logstash not creating right number of documents when passing folder as an argument


(saisn) #1

Hi,

What I'm doing: I created 4 conf files under a root directory called XYZ. Each conf file imports 1000 rows from SQL Server, and the tables being imported are unique across all 4 files. When I run each conf file separately, each index ends up with 1000 documents, but when I run Logstash with the root folder as the argument, the counts in the indexes are not 1000.
I've also observed that each index is picking up documents from the other tables. I'm also using a template for each index, and the template names are all different.

I've given different index names and different document IDs in all the config files. But somehow the number of documents created in each index differs from what I expect.
However, when I run the config files separately, the number of docs created is correct.


(João Duarte) #2

I'm not sure what is happening without seeing the configuration files, but remember that in Logstash, if you run it with multiple configuration files, they will all be concatenated and evaluated as a single one.
So, make sure you don't have unnecessarily duplicated input/filter/output sections.
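
To illustrate what concatenation means here (file names and index names below are illustrative, not taken from the actual configs): if two files in the directory each define their own input and output, the effective pipeline Logstash runs is the union of both, and every event flows through every output.

  # file1.conf and file2.conf are concatenated, so Logstash effectively runs:
  input {
    jdbc { ... }   # from file1.conf
    jdbc { ... }   # from file2.conf
  }
  output {
    elasticsearch { index => "index1" }   # from file1.conf
    elasticsearch { index => "index2" }   # from file2.conf
  }
  # every event from either jdbc input is sent to BOTH elasticsearch outputs

This is why each index can end up with documents from tables it was never meant to receive.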


(saisn) #3

I can't share the whole config file, but each conf file fetches data from a single table and writes to a single index. I have 4 conf files like the one below, one per table, under a root directory.

input {
  jdbc {
    jdbc_driver_library => "/h"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://"
    jdbc_user => "ReadOnly"
    jdbc_password => ""
    #lowercase_column_names => false
    #schedule => "*/10 * * * *"
    clean_run => true
    use_column_value => true
    tracking_column => ***
    record_last_run => true
    # used for incremental updates of records by keeping a reference point in ../jinfo
    last_run_metadata_path => "/etc/logstash/run_metadata.d/"
    statement => "SELECT *
      FROM [a] where [a] > :sql_last_value order by [amd] asc"
    #jdbc_paging_enabled => "true"
    #jdbc_page_size => "50000"
    #statement_filepath => "query.sql"
  }
}
filter {
}
output {
  elasticsearch {
    user => ""
    password => ""
    #ssl => true
    #ssl_certificate_verification => true
    truststore => ""
    truststore_password => ""
    hosts => [""]
    index => "h1"
    document_type => ""
    document_id => "%{cd}"
    #protocol => "http"
  }
}


(João Duarte) #4

When you say the count is different than expected, is it more or less?


(saisn) #5

It's more.


(João Duarte) #6

Does each of your individual configuration files have the same structure as the one below?

input {
  jdbc {
    # ...
  }
}
filter {
}
output {
  elasticsearch {
     # ...
  }
}

If so, when you execute bin/logstash -f *.conf, all your N files will be merged into one, which means you now have N jdbc blocks but also N elasticsearch blocks. So for each event one of the jdbc blocks produces, you're sending it to Elasticsearch N times instead of once.

You need 1 file with:

filter {

}
output {
  elasticsearch {
    user => ""
    password => ""
    truststore => ""
    truststore_password => ""
    hosts => [""]
    index => "h1"
    document_type => ""
    document_id => "%{cd}"
    #protocol => "http"
  }
}

and then N files, each with just the jdbc input:

input {
  jdbc {
  }
}
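
An alternative, if you'd rather keep each file self-contained with its own input and output, is to tag the events in each jdbc input and wrap each elasticsearch output in a conditional, so an output only receives events from its matching input. A sketch (the tag and index names here are illustrative):

  input {
    jdbc {
      # ... connection settings as before ...
      tags => ["table_a"]
    }
  }
  output {
    if "table_a" in [tags] {
      elasticsearch {
        index => "index_a"
        # ... credentials and hosts as before ...
      }
    }
  }

This still runs as one merged pipeline, but each event is only written once, to the index meant for its source table.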

(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.