Hi,
I'm indexing an XML file and sending the output to Elasticsearch, but after approximately 2 hours I get an OutOfMemoryError like this:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid6940.hprof ...
Heap dump file created [9277681492 bytes in 117.530 secs]
[2017-08-04T23:02:13,255][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {"exception"=>java.lang.OutOfMemoryError: Java heap space, "backtrace"=>[]}
[2017-08-04T23:02:15,181][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash.
Error: Your application used more memory than the safety cap of 6G.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace
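For what it's worth, I haven't touched the heap settings yet. As far as I know, the 6G cap comes from the -Xmx value in Logstash's config/jvm.options (it can also be overridden via the LS_JAVA_OPTS environment variable), so I assume it could be raised with something like:

-Xms8g
-Xmx8g

but I suspect that would only postpone the error rather than fix it, given the size of the file.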
Here is my configuration file:
input {
  file {
    path => "path/drug.xml"
    type => "drugbank"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^<\?drugbank.*\>"
      negate => true
      what => "previous"
      max_bytes => "400 MiB"
      max_lines => 5000000000
    }
  }
}

filter {
  xml {
    source => "message"
    target => "xmldata"
    store_xml => false
    xpath => [ "/drugbank/drug", "drug" ]
  }
  mutate {
    remove_field => [ "message", "inxml", "xmldata" ]
  }
  split {
    field => "[drug]"
  }
  xml {
    source => "drug"
    store_xml => false
    xpath => [ "/drug/drugbank-id/text()", "Drug ID" ]
    xpath => [ "/drug/name/text()", "Drug name" ]
    xpath => [ "/drug/targets/target/polypeptide/gene-name/text()", "Gene" ]
  }
  mutate {
    replace => {
      "Drug ID" => "%{[Drug ID][0]}"
      "Drug name" => "%{[Drug name][0]}"
      "Gene" => "%{[Gene][0]}"
    }
  }
  mutate {
    remove_field => [ "drug" ]
  }
}

output {
  elasticsearch {
    codec => json
    hosts => ["10.xx.xx.xx2", "10.xx.xx.xx4"]
    index => "test_index"
  }
}
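In case the structure matters, the XML looks roughly like this (a simplified sketch based on the XPath expressions above, with placeholder values, not a verbatim excerpt from the file):

<drugbank>
  <drug>
    <drugbank-id>DB00001</drugbank-id>
    <name>SomeDrug</name>
    <targets>
      <target>
        <polypeptide>
          <gene-name>SOMEGENE</gene-name>
        </polypeptide>
      </target>
    </targets>
  </drug>
  <!-- ...many more <drug> elements... -->
</drugbank>

The intent is that, after the split filter, each event holds one <drug> element with the three extracted fields.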
Any suggestions would be much appreciated.
Thanks