Hello there,
I am running a three-node Elasticsearch cluster on version 7.16.2. The cluster uses basic settings, and no security has been set up.
Here is a sample of the data I tried to ingest:
Abraham Jose A 19551131
Aguilar Drazenko A 18911530
Aldaco Frank A 14511025
Alejandro Bruce A 13427703
Alshefski Aaku A 11234551
Alzer Mason A 80200418
Ancsanyi Dennis A 17465939
Anderson Florenti A 17485930
Anderson Henner A 16784032
Here is the config file I used:
input {
  stdin {
    codec => line {
      charset => "UTF-8"
    }
  }
}
filter {
  # The fingerprint filter creates a hash key from the content of the message.
  # The hash is used as the unique document id for each Elasticsearch entry.
  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA1"
    # For the key we use the name of the index followed by the unique string on the first line of the csv data file.
    key => "traveller_no_dups"
    base64encode => true
  }
  # Define all the fields in the CSV file in the order they are found.
  csv {
    separator => ","
    columns => [
      "SURNAME",
      "FIRST_NAME",
      "MIDDLE_NAME",
      "BIRTHDATE",
    ]
  }
  # Add a new DOB field that holds the BIRTHDATE content.
  mutate {
    add_field => { "DOB" => "%{BIRTHDATE}" }
  }
  # Process the birthdate as DOB: convert it into a date value.
  date {
    match => [ "DOB", "yyyyMMdd"]
    target => "DOB"
  }
  # Remove all fields we don't need anymore.
  mutate {
    remove_field => [ "BIRTHDATE" ]
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost:9200"
    index => "traveller_no_dups"
    document_id => "%{[@metadata][fingerprint]}"
  }
  stdout {codec => rubydebug}
  # stdout {}
}
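To show what I expect the fingerprint settings to produce, here is a rough PowerShell sketch of the document id calculation. It assumes the filter computes an HMAC-SHA1 over the UTF-8 bytes of the message, keyed with the configured key, and then Base64-encodes the result. I have not verified that this matches the plugin byte for byte. The sample line is a hypothetical comma-separated version of the first data row.

```powershell
# Sketch only: approximates fingerprint { method => "SHA1", key => ..., base64encode => true }.
# Assumption: the filter computes HMAC-SHA1 over the UTF-8 bytes of the message field.
$keyBytes     = [System.Text.Encoding]::UTF8.GetBytes('traveller_no_dups')
$messageBytes = [System.Text.Encoding]::UTF8.GetBytes('Abraham,Jose,A,19551131')  # hypothetical CSV line

$hmac = [System.Security.Cryptography.HMACSHA1]::new($keyBytes)
try {
    # Base64-encode the 20-byte digest, as base64encode => true would.
    $documentId = [System.Convert]::ToBase64String($hmac.ComputeHash($messageBytes))
}
finally {
    $hmac.Dispose()
}

"document_id candidate: $documentId"
```

Since the id depends only on the message content, identical input lines should map to the same document, which is why I use it to avoid duplicates.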
When I tried to ingest the data, I got the errors shown below.
The loop below is meant to load the first 9 million rows.
I tried the same process with version 8.0.0, but it gave me the same errors.
The same script works on a single-node implementation running version 7.5.
PS D:\APPS\ELK7.16.2\logstash-7.16.2> $Current_time = Get-Date
"Start run 1 - 9 @ " + $Current_time
$stopwatch_all = [System.Diagnostics.Stopwatch]::StartNew()
" "
For ($i = 1; $i -lt 10; $i++) {
$Current_time = Get-Date
"Start file $i @ " + $Current_time
$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
& type E:\No_Duplicates\Split_Files\PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv | .\bin\logstash.bat –f E:\Scripts\TRAV_NO_DUPS.conf > E:\No_Duplicates\Ingest_Runs\LOGSTASH_NO_DUPS_00$i.txt
"File PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv"
$Current_time = Get-Date
"End file $i @ " + $Current_time
"Elapsed time: " + $stopwatch.Elapsed.ToString()
$stopwatch.Stop()
" "
}
"Overall time 1 - 9: " + $stopwatch_all.Elapsed.ToString()
$stopwatch_all.Stop()
" "
Start run 1 - 9 @ 02/18/2022 14:46:54
Start file 1 @ 02/18/2022 14:46:54
.\bin\logstash.bat : OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be
removed in a future release.
At line:9 char:113
+ ... _00$i.csv | .\bin\logstash.bat –f E:\Scripts\TR ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
ERROR: Unknown command '–f'
See: 'bin/logstash --help'
File PH_MODEL_NAMES_NO_DUPS_SPLIT_001.csv
End file 1 @ 02/18/2022 14:49:08
Elapsed time: 00:02:14.3544994
Start file 2 @ 02/18/2022 14:49:08
The error in the log file is below.
"Using bundled JDK: ."
[FATAL] 2022-02-22 14:40:47.126 [main] Logstash - Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:747) ~[jruby-complete-9.2.20.1.jar:?]
at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:710) ~[jruby-complete-9.2.20.1.jar:?]
at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.vendor.bundle.jruby.$2_dot_5_dot_0.gems.clamp_minus_1_dot_0_dot_1.lib.clamp.command.run(D:/APPS/ELK8.0.0/logstash-8.0.0/vendor/bundle/jruby/2.5.0/gems/clamp-1.0.1/lib/clamp/command.rb:138) ~[?:?]
at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.lib.bootstrap.environment.<main>(D:\APPS\ELK8.0.0\logstash-8.0.0\lib\bootstrap\environment.rb:93) ~[?:?]
• Is it a code page issue?
ERROR: Unknown command '–f'
• It fails at the Logstash call:
At line:9 char:113
.\bin\logstash.bat –f E:\Scripts\TR ...
• Is it a JDK error?
+ CategoryInfo : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
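One way to test the code page question is to check which character actually sits in front of the f. The sketch below prints the Unicode code point of each character in the flag. It assumes the flag is pasted exactly as it appears in my script; the variable name is only for illustration. A plain ASCII hyphen would show up as U+002D, so any other value for the first character would mean Logstash is not receiving the -f option it expects.

```powershell
# Paste the flag exactly as it appears in the failing command (illustrative variable name).
$flag = '–f'

# Print each character with its Unicode code point.
# An ASCII hyphen-minus is U+002D; a typographic dash reports a different value.
foreach ($ch in $flag.ToCharArray()) {
    '{0} U+{1:X4}' -f $ch, [int]$ch
}
```

If the first character is not U+002D, retyping the dash by hand in the script would be the next thing to try.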
Please let me know if someone can help!
Thank you,
Akhil