Hi there
I hope someone can give me some advice for this.
I have a portion of ruby code that makes Logstash looping (CPU = 100 %, no more indexing in Elasticsearch).
I try to index about 400 000 events and I see the indexing throuput gradually getting down (about half events eventually indexed)
I thought the loop was inside the code so I added a new field "elapsed_time_in_Ruby". And It appears the loop is not inside Ruby itself
Any idea would be very much appreciated
Logstash 2.3.3
`The loop appears when I add the following lines of code :
(trying to catch couples of (key, value) with a separator ( between key and value and at least to white spaces between couples. For example :
my key1 : value 1 my key 2 : value 2 my key 3 : value 3
expected result : (my_key1, value 1), (my_key_2, value 2), (my_key_3, value 3)
# Fonction permettant de récupérer un nom de clé nettoyé et d'autre part la string d'entrée amputée de sa première partie (la clé et le ':" remplacés par |)
def getonekey(s)
# la clé est constituée de chaînes de caractères (\S+) séparées par au plus un espace et 'à gauche' du séparateur (:). Voir http://rubular.com/ pour les regexp Ruby
pattern = '\S+(\s?\S+)+(\s)*:'
# le match renvoie la clé dans le premier élément
key = s.match(pattern)[0]
# remplacement de la clé complète par un pipe
s = s.gsub(key,'|')
# cleansing de la clé
key = key.gsub(':','').strip.tr(' ','_').tr('.','_').downcase
#
return[s,key]
end
.... (ruby code before (ok))
if (nbre_separateurs > 1)
patternko1 = ':\s*:'
if (v.match(patternko1).to_s.empty?)
s = v
keylist = Array.new
begin
for i in (1..nbre_separateurs) do
res = getonekey(s)
s = res[0]
keylist.push(res[1])
end
vallist = s.split('|')
for i in (1..nbre_separateurs) do
if (event.to_hash.include?(keylist[i-1]))
keylist[i-1].concat('_').concat(column_number.to_s)
end
if (!vallist[i].to_s.empty?)
event[keylist[i-1]] = vallist[i].strip
end
end
end
else
(event['tags'] ||= []) << 'pb parsing patternko1 ligne '.concat(column_number.to_s)
end
`
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.