Logstash Error

Hello there,

I am running a three-node Elasticsearch cluster, version 7.16.2. I set the cluster up with basic settings; no security has been configured.

Here is a sample of the data I tried to ingest:

Abraham	Jose	A	19551131
Aguilar	Drazenko	A	18911530
Aldaco	Frank	A	14511025
Alejandro	Bruce	A	13427703
Alshefski	Aaku	A	11234551
Alzer	Mason	A	80200418
Ancsanyi	Dennis	A	17465939
Anderson	Florenti	A	17485930
Anderson	Henner	A	16784032

Here is the config file I used:

input {
	stdin { 
		codec => line {
			charset => "UTF-8"
		}
	}
}

filter {
	# The fingerprint filter creates a unique identifier that is used as the document id. 
	# This creates a hash key based on the content message that is used as a unique id/key for each elasticsearch entry. 
	 
	fingerprint { 
		source => "message"
		target => "[@metadata][fingerprint]"
		method => "SHA1"
		# For the key we use the name of the index followed by the unique string on the first line of the csv data file.  
		key => "traveller_no_dups"
		base64encode => true
	}
	
	# Define all the fields in the csv file, in the order they appear.

	csv {
		separator => ","
		columns => [
			"SURNAME",
			"FIRST_NAME",
			"MIDDLE_NAME",
			"BIRTHDATE"
		]
	}

	# Add a new DOB field that holds the BIRTHDATE content.
	mutate {
		add_field => { "DOB" => "%{BIRTHDATE}" }
	}
	
	# Process the birthdate as DOB: convert it into a date value.
	date {
		match => [ "DOB", "yyyyMMdd"]
		target => "DOB"
	}

	# Remove all fields we don't need anymore.
	mutate { 
		remove_field => [ "BIRTHDATE" ]
	}	
}

output {
	elasticsearch {
		action => "index"
		hosts => "localhost:9200"
		index => "traveller_no_dups"
		document_id => "%{[@metadata][fingerprint]}"
	}
	stdout { codec => rubydebug }
#	stdout {}
}


When I tried to ingest the data with the loop below, I got the following errors.
The loop is meant to load the first 9 million rows.

I tried the same process with version 8.0.0 and got the same errors.
The same script works on a single-node implementation running version 7.5.

PS D:\APPS\ELK7.16.2\logstash-7.16.2> $Current_time = Get-Date
"Start run 1 - 9 @ " + $Current_time
$stopwatch_all = [System.Diagnostics.Stopwatch]::StartNew()
" "
For ($i = 1; $i -lt 10; $i++) {
   $Current_time = Get-Date
   "Start file $i @ " + $Current_time
   $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
   & type E:\No_Duplicates\Split_Files\PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv | .\bin\logstash.bat –f E:\Scripts\TRAV_NO_DUPS.conf > E:\No_Duplicates\Ingest_Runs\LOGSTASH_NO_DUPS_00$i.txt
   "File PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv"
   $Current_time = Get-Date
   "End file $i @ " + $Current_time
   "Elapsed time: " + $stopwatch.Elapsed.ToString()
   $stopwatch.Stop()
   " "
}
"Overall time 1 - 9: " + $stopwatch_all.Elapsed.ToString()
$stopwatch_all.Stop()
" "

Start run 1 - 9 @ 02/18/2022 14:46:54

Start file 1 @ 02/18/2022 14:46:54
.\bin\logstash.bat : OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be 
removed in a future release.
At line:9 char:113
+ ... _00$i.csv | .\bin\logstash.bat –f E:\Scripts\TR ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

ERROR: Unknown command '–f'
See: 'bin/logstash --help'
File PH_MODEL_NAMES_NO_DUPS_SPLIT_001.csv
End file 1 @ 02/18/2022 14:49:08
Elapsed time: 00:02:14.3544994

Start file 2 @ 02/18/2022 14:49:08

The error in the log file is below.

"Using bundled JDK: ."
[FATAL] 2022-02-22 14:40:47.126 [main] Logstash - Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
	at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:747) ~[jruby-complete-9.2.20.1.jar:?]
	at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:710) ~[jruby-complete-9.2.20.1.jar:?]
	at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.vendor.bundle.jruby.$2_dot_5_dot_0.gems.clamp_minus_1_dot_0_dot_1.lib.clamp.command.run(D:/APPS/ELK8.0.0/logstash-8.0.0/vendor/bundle/jruby/2.5.0/gems/clamp-1.0.1/lib/clamp/command.rb:138) ~[?:?]
	at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.lib.bootstrap.environment.<main>(D:\APPS\ELK8.0.0\logstash-8.0.0\lib\bootstrap\environment.rb:93) ~[?:?]

•	Is it a code page issue?
	ERROR: Unknown command '–f'
•	It fails at the Logstash call:
	At line:9 char:113
	.\bin\logstash.bat –f E:\Scripts\TR ...
•	Is it a JDK error?
	+ CategoryInfo          : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
	+ FullyQualifiedErrorId : NativeCommandError

Please let me know if someone can help!

Thank you,
Akhil

The message ERROR: Unknown command '–f' suggests the flag was typed with an en dash rather than a hyphen. Do you have –f (en dash) instead of -f (ASCII hyphen)?
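
If it is hard to tell the two characters apart by eye, something like the following rough sketch can flag look-alike dashes in a script or command line. It is written in Python purely for illustration; the function name find_lookalike_dashes is made up here, and the check only covers characters in the Unicode "dash punctuation" category (Pd), which includes the en dash and em dash.

```python
import unicodedata


def find_lookalike_dashes(text):
    """Return (line_no, column, unicode_name) for each non-ASCII dash in text.

    Only characters in Unicode category "Pd" (dash punctuation) are reported;
    the plain ASCII hyphen-minus "-" is skipped because it is the valid flag prefix.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch != "-" and unicodedata.category(ch) == "Pd":
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


# The Logstash call from the question, with the en dash written as \u2013.
command = ".\\bin\\logstash.bat \u2013f E:\\Scripts\\TRAV_NO_DUPS.conf"

for line_no, col, name in find_lookalike_dashes(command):
    print(f"line {line_no}, column {col}: {name}")
```

Retyping the flag by hand as -f in the editor (rather than pasting it from a document or web page) should be enough to fix the call.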

Your mutate / date / mutate sequence will work, but I would suggest

date {
	match => [ "BIRTHDATE", "yyyyMMdd"]
	target => "DOB"
	remove_field => [ "BIRTHDATE" ]
}

The remove_field option is only applied when the date filter parses the field successfully, so the [BIRTHDATE] field is left intact if parsing fails. If someone sends you dodgy data, you will be able to see what is wrong with it.
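
As an illustration of what dodgy data could look like, here is a minimal Python sketch that checks whether a yyyyMMdd string is a real calendar date. It is not what the Logstash date filter does internally, and the helper name is_valid_yyyymmdd is hypothetical. Two of the sample values from the question (19551131 and 18911530) fail this check, so they are the kind of row where keeping [BIRTHDATE] would help.

```python
from datetime import date


def is_valid_yyyymmdd(value):
    """Return True if value is exactly 8 digits that form a real calendar date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        # date() raises ValueError for impossible months or days (e.g. November 31).
        date(int(value[:4]), int(value[4:6]), int(value[6:]))
    except ValueError:
        return False
    return True


# Birthdate values taken from the sample rows in the question.
for sample in ["19551131", "18911530", "14511025"]:
    print(sample, is_valid_yyyymmdd(sample))
```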

Thanks Badger! This worked and my issue is resolved.