Hello people!
I have the following problem:
1 - I have an XML file of roughly 69 MB, and my configuration is as follows:
input {
  file {
    path => "/tmp/ACOR_2019.XML"
    start_position => "beginning"
    sincedb_path => "NUL"
    max_open_files => 65000
    codec => multiline {
      charset => "ISO-8859-1"
      pattern => "^<ACOR.*"
      negate => "true"
      what => "previous"
      auto_flush_interval => 2
    }
  }
}
filter {
  xml {
    source => "message"
    force_array => "false"
    store_xml => "false"
    xpath => [
      "/ACOR/DOCN/text()", "DOCN",
      "/ACOR/TIPO/text()", "TIPO",
      "/ACOR/REG/text()", "REG",
      "/ACOR/CLAS/text()", "CLAS",
      "/ACOR/DCLA/text()", "DCLA",
      "/ACOR/NUM/text()", "NUM",
      "/ACOR/UF/text()", "UF",
      "/ACOR/DECI/text()", "DECI",
      "/ACOR/DTDE/text()", "DTDE",
      "/ACOR/CORG/text()", "CORG",
      "/ACOR/ORG/text()", "ORG",
      "/ACOR/EMEN_S/EMEN/text()", "EMEM",
      "/ACOR/RELA/text()", "RELA",
      "/ACOR/NREL/text()", "NREL",
      "/ACOR/REVI/text()", "REVI",
      "/ACOR/NREV/text()", "NREV",
      "/ACOR/RACO/text()", "RACO",
      "/ACOR/NRAC/text()", "NRAC",
      "/ACOR/INDE_S/INDE/text()", "INDE",
      "/ACOR/STIP_S/STIP/text()", "STIP",
      "/ACOR/OPIX/text()", "OPTIX",
      "/ACOR/DTIX/text()", "DTIX",
      "/ACOR/SUCE_S/SUCE/text()", "SUCE",
      "/ACOR/TSUC/text()", "TSUC",
      "/ACOR/FONT_S/FONT/text()", "FONT",
      "/ACOR/DTPB/text()", "DTPB",
      "/ACOR/VEJA/text()", "VEJA",
      "/ACOR/DOUT_S/DOUT/text()", "DOUT",
      "/ACOR/REF_S/REF/text()", "REF",
      "/ACOR/DTIN/text()", "DTIN",
      "/ACOR/OPIN/text()", "OPIN",
      "/ACOR/DTAL/text()", "DTAL",
      "/ACOR/OPAL/text()", "OPAL",
      "/ACOR/CDOC/text()", "CDOC",
      "/ACOR/NOTA/text()", "NOTA",
      "/ACOR/DTRV/text()", "DTRV",
      "/ACOR/AREV/text()", "AREV",
      "/ACOR/LINK/text()", "LINK",
      "/ACOR/COMP/text()", "COMP",
      "/ACOR/LREF/text()", "LREF",
      "/ACOR/CLAP/text()", "CLAP",
      "/ACOR/DCOM/text()", "DCON",
      "/ACOR/IDTQ/text()", "IDTQ",
      "/ACOR/NDTQ/text()", "NDTQ",
      "/ACOR/MDTQ/text()", "MDTQ",
      "/ACOR/RDTQ/text()", "RDTQ",
      "/ACOR/INDX/text()", "INDX",
      "/ACOR/NOTF/text()", "NOTF",
      "/ACOR/DOUF/text()", "DOUF",
      "/ACOR/TPOB/text()", "TPOB",
      "/ACOR/OPCL/text()", "OPCL",
      "/ACOR/DTCL/text()", "DTCL",
      "/ACOR/DISP/text()", "DISP",
      "/ACOR/DTCF/text()", "DTCF",
      "/ACOR/ACON/text()", "ACON",
      "/ACOR/IDL/text()", "IDL",
      "/ACOR/INDW_S/INDW/text()", "INDW",
      "/ACOR/PART_S/PART/text()", "PART",
      "/ACOR/FNRE/text()", "FNRE",
      "/ACOR/TSJU/text()", "TSJU",
      "/ACOR/CSCO/text()", "CSCO",
      "/ACOR/DTMN/text()", "DTMN",
      "/ACOR/AMON/text()", "AMON"
    ]
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "ac-%{+yyyy}"
    user => "elastic"
    password => "XXXXX"
  }
  # stdout {}
}
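For context, the XPath expressions in the filter can be sanity-checked outside Logstash. Below is a minimal Python sketch against a hypothetical, made-up `<ACOR>` record (the field values are assumptions, not taken from the real export); note that the stdlib `ElementTree` supports only a subset of XPath, so `/ACOR/DOCN/text()` becomes a relative `find()` plus `.text`:

```python
# Standalone sanity check for the XPath extraction, outside Logstash.
# The sample record below is a hypothetical, minimal <ACOR> document --
# adjust the fields to match the real export.
import xml.etree.ElementTree as ET

sample = """<ACOR>
  <DOCN>000123456</DOCN>
  <TIPO>ACORDAO</TIPO>
  <EMEN_S><EMEN>Example summary text</EMEN></EMEN_S>
</ACOR>"""

root = ET.fromstring(sample)

# ElementTree supports a limited XPath dialect, so "/ACOR/DOCN/text()"
# is expressed as a relative find() plus .text on the matched element.
def extract(path):
    el = root.find(path)
    return el.text if el is not None else None

print(extract("DOCN"))         # value of /ACOR/DOCN/text()
print(extract("EMEN_S/EMEN"))  # value of /ACOR/EMEN_S/EMEN/text()
```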
2 - When I run Logstash with debug enabled, I see that at some point ingestion and filtering stop running, yet no messages are written to the Logstash logs. Looking at the monitoring graphs, the pipeline has stopped collecting and filtering, as shown in the accompanying figures:
The filter stage (note the stop time):
The input stage:
Analyzing my server, even though the logs show no new events, I can see that Logstash is consuming all of the CPU, as shown below:
As mentioned earlier, the Logstash logs contain no errors or anomalies.
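Since the logs stay silent, one way to watch the stall is to poll the Logstash node stats API (served on port 9600 by default) and print the pipeline event counters. A small sketch, assuming the default pipeline name `main`:

```python
# Watch the pipeline counters without relying on the log files: poll the
# Logstash node stats API and print event counts. The port (9600) and
# pipeline name ("main") are the defaults and may differ in your setup.
import json
import time
import urllib.request

STATS_URL = "http://localhost:9600/_node/stats/pipelines"

def event_counts(stats, pipeline="main"):
    """Extract the in/filtered/out counters from a node-stats response."""
    events = stats["pipelines"][pipeline]["events"]
    return events["in"], events["filtered"], events["out"]

def watch(interval=5):
    while True:
        with urllib.request.urlopen(STATS_URL) as resp:
            stats = json.load(resp)
        print("in=%d filtered=%d out=%d" % event_counts(stats))
        time.sleep(interval)

if __name__ == "__main__":
    watch()
```

If the `filtered` counter stops increasing while Logstash still burns CPU, that points at the filter stage rather than the input.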
3 - When I remove the xml filter, the messages are processed correctly, totalling 10604 events; with the filter enabled, it only processes between 1300 and 1400 events.
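The expected record count can also be cross-checked independently of Logstash: since the multiline codec starts a new event at every line matching `^<ACOR.*`, counting those lines should reproduce the 10604 figure. A sketch (the file path and encoding are taken from the config above):

```python
# Cross-check the expected event count independently of Logstash: each
# line starting with "<ACOR" begins one record, mirroring the multiline
# codec pattern "^<ACOR.*" from the config.
import re

PATTERN = re.compile(r"^<ACOR")

def count_records(lines):
    """Count lines that would start a new multiline event."""
    return sum(1 for line in lines if PATTERN.match(line))

if __name__ == "__main__":
    with open("/tmp/ACOR_2019.XML", encoding="ISO-8859-1") as f:
        print(count_records(f))
```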
4 - I have a feeling that the xml filter can't handle large XML files.
5 - My environment has the following configuration:
8-core CPU
6 GB heap
M.2 SSD disk
Logstash 7.3.1
6 - This is an example of my XML:
https://raw.githubusercontent.com/tornis/elastic/master/example.xml
Has anyone had this problem?
Thank you