I have 4 independent Logstash config files, each taking input from the same CSV file. I am running all 4 in the background with

logstash -f /path/to/config/file
But when my CSV file is updated with new data, all 4 configurations access it at the same time and all try to load the data into Elasticsearch at once. My Elasticsearch server goes down because it is unable to handle that many requests at the same time.
I know scheduling is available for the JDBC input, but I could not find anything similar for CSV files.
Is there any way to schedule Logstash config files to run at a particular time, or any other solution to overcome this problem?
My Elasticsearch server goes down because it is unable to handle that many requests at the same time.
It sounds like you have an underpowered ES server. Does it run out of heap, or why does it crash?
Is there any way to schedule Logstash config files to run at a particular time, or any other solution to overcome this problem?
You could use cron, but I think you're trying to solve the wrong problem. Logstash and its file input are meant to continuously monitor files, not batch-process files periodically.
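If you do go down the cron road, a crontab entry per config would look roughly like this (the paths are placeholders, and note that a pipeline using the file input keeps running to tail the file rather than exiting when it reaches the end):

    # hypothetical crontab entry: run one config every day at 01:00
    0 1 * * * /usr/share/logstash/bin/logstash -f /path/to/config1.conf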
Yes, I allocated 1 GB of heap to Elasticsearch. Since all the config files send requests to my Elasticsearch server at the same time, it goes down with an out-of-heap-memory error. I want to send each config file's requests one after another after the CSV file is updated.
If I run the Logstash config files from a cron job, the sincedb used by the file input will be a problem: every time cron runs a config file, Logstash will find the sincedb that already exists from the previous run and will not read the file again.
Yes, I allocated 1 GB of heap to Elasticsearch. Since all the config files send requests to my Elasticsearch server at the same time, it goes down with an out-of-heap-memory error. I want to send each config file's requests one after another after the CSV file is updated.
Your ES server is underpowered. I strongly suggest you solve that problem instead of working around it by serializing requests.
If I run the Logstash config files from a cron job, the sincedb used by the file input will be a problem: every time cron runs a config file, Logstash will find the sincedb that already exists from the previous run and will not read the file again.
If you want multiple Logstash instances to process the same file they need to have different sincedb files. Use the file input's sincedb_path to explicitly set the per-instance path.
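For example, each config file can set its own path in the file input (the paths below are just placeholders):

    input {
      file {
        path => "/path/to/data.csv"
        start_position => "beginning"
        # give each Logstash instance its own sincedb file
        sincedb_path => "/var/lib/logstash/sincedb-config1"
      }
    }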
Thanks @magnusbaeck for your quick response.
Yes! My Elasticsearch is underpowered; I only allocated 1 GB of heap. I'll increase the heap size.
For the time being, I solved the issue by writing all 4 Elasticsearch outputs in the same Logstash config file, in series, so that the requests are sent one after another.
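Roughly like this (the index names are placeholders for my real ones):

    input {
      file {
        path => "/path/to/data.csv"
        sincedb_path => "/var/lib/logstash/sincedb-combined"
      }
    }
    output {
      # a single pipeline feeds all four indexes, so each batch of
      # events is sent to the outputs in turn instead of four
      # separate Logstash processes hitting Elasticsearch at once
      elasticsearch { hosts => ["localhost:9200"] index => "index_1" }
      elasticsearch { hosts => ["localhost:9200"] index => "index_2" }
      elasticsearch { hosts => ["localhost:9200"] index => "index_3" }
      elasticsearch { hosts => ["localhost:9200"] index => "index_4" }
    }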
And can you please explain a little more about the last line: "If you want multiple Logstash instances to process the same file they need to have different sincedb files. Use the file input's sincedb_path to explicitly set the per-instance path."
That sounds interesting; I have never done this. Could you please explain?
Thanks
And can you please explain a little more about the last line: "If you want multiple Logstash instances to process the same file they need to have different sincedb files. Use the file input's sincedb_path to explicitly set the per-instance path."
That sounds interesting; I have never done this. Could you please explain?
I'm not sure what's unclear. The sincedb_path option should be pretty straightforward.
Yes, I am using a different sincedb path for each Logstash config file.
But if I use a cron job, this problem will arise:
Say cron runs a config file today, which creates the sincedb. When it runs the same config file tomorrow, the old sincedb file will prevent it from reading the file again, because Logstash resumes from the existing sincedb. That's what I am asking: can we avoid that sincedb path?
As I said, if you want multiple Logstash instances to process the same file they need to have different sincedb files. Use the file input's sincedb_path to explicitly set the per-instance path.
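And if what you actually want is for every run to re-read the whole file from the beginning, a common trick with the file input (not otherwise mentioned in this thread) is to point sincedb_path at /dev/null so nothing is remembered between runs:

    input {
      file {
        path => "/path/to/data.csv"
        start_position => "beginning"
        # the sincedb is written to /dev/null and discarded, so
        # every run re-reads the file from the start
        sincedb_path => "/dev/null"
      }
    }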