Logstash Error

Hello there,

I am running a three-node Elasticsearch cluster on version 7.16.2. The cluster uses basic settings, and no security has been set up.

Here is an example of the data I tried to ingest.

Abraham	Jose	A	19551131
Aguilar	Drazenko	A	18911530
Aldaco	Frank	A	14511025
Alejandro	Bruce	A	13427703
Alshefski	Aaku	A	11234551
Alzer	Mason	A	80200418
Ancsanyi	Dennis	A	17465939
Anderson	Florenti	A	17485930
Anderson	Henner	A	16784032

Here is the config file I used:

input {
	stdin { 
		codec => line {
			charset => "UTF-8"
		}
	}
}

filter {
	# The fingerprint filter creates a unique identifier that is used as the document id. 
	# This creates a hash key based on the content message that is used as a unique id/key for each elasticsearch entry. 
	 
	fingerprint { 
		source => "message"
		target => "[@metadata][fingerprint]"
		method => "SHA1"
		# For the key we use the name of the index followed by the unique string on the first line of the csv data file.  
		key => "traveller_no_dups"
		base64encode => true
	}
	
	# Defines all the fields in the csv file in the order they are found. 
	
	csv {
		separator => ","
		columns => [
			"SURNAME", 
			"FIRST_NAME", 
			"MIDDLE_NAME", 
			"BIRTHDATE", 
		]
	}

	# Add a new DOB field that will hold the BIRTHDATE content. 
	mutate {
		add_field => { "DOB" => "%{BIRTHDATE}" }
	}
	
	#	Process the birthdate as DOB. Convert the birthdate into a date value. 
	date {
		match => [ "DOB", "yyyyMMdd"]
		target => "DOB"
	}

	# Remove all fields we don't need anymore. 
	mutate { 
		remove_field => [ "BIRTHDATE" ]
	}	
}

output { 
	elasticsearch {
		action => "index"
		hosts => "localhost:9200"
		index => "traveller_no_dups"
		document_id => "%{[@metadata][fingerprint]}"
	}
	stdout { codec => rubydebug }
#	stdout {}
}


When I tried to ingest the data, I got the errors shown below. The loop is meant to load the first 9 million rows (files 1 to 9).

I tried the same process with version 8.0.0 and got the same errors.
The same script works on a single-node implementation running version 7.5.

PS D:\APPS\ELK7.16.2\logstash-7.16.2> $Current_time = Get-Date
"Start run 1 - 9 @ " + $Current_time
$stopwatch_all = [System.Diagnostics.Stopwatch]::StartNew()
" "
For ($i = 1; $i -lt 10; $i++) {
   $Current_time = Get-Date
   "Start file $i @ " + $Current_time
   $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
   & type E:\No_Duplicates\Split_Files\PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv | .\bin\logstash.bat –f E:\Scripts\TRAV_NO_DUPS.conf > E:\No_Duplicates\Ingest_Runs\LOGSTASH_NO_DUPS_00$i.txt
   "File PH_MODEL_NAMES_NO_DUPS_SPLIT_00$i.csv"
   $Current_time = Get-Date
   "End file $i @ " + $Current_time
   "Elapsed time: " + $stopwatch.Elapsed.ToString()
   $stopwatch.Stop()
   " "
}
"Overall time 1 - 9: " + $stopwatch_all.Elapsed.ToString()
$stopwatch_all.Stop()
" "

Start run 1 - 9 @ 02/18/2022 14:46:54

Start file 1 @ 02/18/2022 14:46:54
.\bin\logstash.bat : OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be 
removed in a future release.
At line:9 char:113
+ ... _00$i.csv | .\bin\logstash.bat –f E:\Scripts\TR ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

ERROR: Unknown command '–f'
See: 'bin/logstash --help'
File PH_MODEL_NAMES_NO_DUPS_SPLIT_001.csv
End file 1 @ 02/18/2022 14:49:08
Elapsed time: 00:02:14.3544994

Start file 2 @ 02/18/2022 14:49:08

The error in the log file is below.

"Using bundled JDK: ."
[FATAL] 2022-02-22 14:40:47.126 [main] Logstash - Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
	at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:747) ~[jruby-complete-9.2.20.1.jar:?]
	at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:710) ~[jruby-complete-9.2.20.1.jar:?]
	at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.vendor.bundle.jruby.$2_dot_5_dot_0.gems.clamp_minus_1_dot_0_dot_1.lib.clamp.command.run(D:/APPS/ELK8.0.0/logstash-8.0.0/vendor/bundle/jruby/2.5.0/gems/clamp-1.0.1/lib/clamp/command.rb:138) ~[?:?]
	at D_3a_.APPS.ELK8_dot_0_dot_0.logstash_minus_8_dot_0_dot_0.lib.bootstrap.environment.<main>(D:\APPS\ELK8.0.0\logstash-8.0.0\lib\bootstrap\environment.rb:93) ~[?:?]

•	Is it a code page issue?
	    ERROR: Unknown command '–f'
•	It fails at the Logstash call:
	    At line:9 char:113
	    .\bin\logstash.bat –f E:\Scripts\TR ...
•	Is it a JDK error?
	    + CategoryInfo          : NotSpecified: (OpenJDK 64-Bit ...future release.:String) [], RemoteException
	    + FullyQualifiedErrorId : NativeCommandError

Please let me know if someone can help!

Thank you,
Akhil

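ERROR: Unknown command '–f' suggests that the option was typed with an en dash (–) rather than a plain ASCII hyphen (-), which often happens when a command is pasted from a word processor or email. Do you have –f instead of -f?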

Your mutate and date filters for the DOB field will work, but I would suggest

date {
	match => [ "BIRTHDATE", "yyyyMMdd"]
	target => "DOB"
	remove_field => [ "BIRTHDATE" ]
}

That will leave the [BIRTHDATE] field intact if the date filter is unable to parse it, because remove_field is only applied when the filter succeeds. So if someone sends you dodgy data you will be able to see what is wrong with it.
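
To make use of that, the events that failed to parse can be routed somewhere you can look at them. The following is only a sketch, assuming the date filter's default tag_on_failure value of _dateparsefailure and reusing the hosts, index and document_id settings from the config in the question; the fingerprint and csv filters are left out here and would stay as they are. Values such as 18911530 in the sample data (month 15) cannot be parsed as yyyyMMdd, so they would take the first branch and still carry BIRTHDATE.

```
filter {
	date {
		match => [ "BIRTHDATE", "yyyyMMdd" ]
		target => "DOB"
		# Only applied when the date parses, so failed events keep BIRTHDATE.
		remove_field => [ "BIRTHDATE" ]
	}
}

output {
	if "_dateparsefailure" in [tags] {
		# Assumption: print unparsed events for inspection instead of indexing them.
		stdout { codec => rubydebug }
	} else {
		elasticsearch {
			hosts => "localhost:9200"
			index => "traveller_no_dups"
			document_id => "%{[@metadata][fingerprint]}"
		}
	}
}
```

Whether bad rows should be dropped, printed, or indexed with the raw BIRTHDATE value is a design choice; the conditional only shows where that decision can be made.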

Thanks Badger! This worked and my issue is resolved. :grinning: