Ingest node

I used to use Logstash to ingest data into Elasticsearch, and now I plan to switch to the Elasticsearch ingest node. However, there is one piece of functionality I rely on: I need to generate a hash that uniquely identifies each document.

This is the Logstash code I have. How can I get the same functionality from an ingest processor as the following Logstash filter provides?

        ruby {
           code => "require 'digest/md5';
                    event['@metadata']['computed_id'] = Digest::MD5.hexdigest(event['field1'].to_s + event['field2'].to_s + event['record_time'].to_s);"
        }
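Outside of Logstash, the same computation can be reproduced in plain Ruby. This is a minimal sketch of what the filter above does; the field names come from the filter, but the event values here are made up for illustration:

```ruby
require 'digest/md5'

# Illustrative event; the real values come from each Logstash event.
event = { 'field1' => 'abc', 'field2' => 42, 'record_time' => '2016-03-01T00:00:00Z' }

# Concatenate the string forms of the three fields and MD5 them,
# exactly as the Logstash ruby filter does.
computed_id = Digest::MD5.hexdigest(
  event['field1'].to_s + event['field2'].to_s + event['record_time'].to_s
)

puts computed_id # a 32-character hex string
```

Note that the hash depends on the concatenation order and on each field's string form, so any replacement (ingest processor or otherwise) must concatenate the same fields in the same order to produce matching IDs.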

I have two questions:

  1. Is there any way to use an external MD5 library to generate a hash from the values of certain fields?
  2. On the ingest node, how do I set ['@metadata']['computed_id'] so that Elasticsearch picks it up as the document ID for each document?
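For readers on recent Elasticsearch versions: since 7.12 there is a built-in `fingerprint` ingest processor that covers question 1 without any external plugin. A minimal pipeline sketch (the pipeline name and field names are illustrative, taken from the filter above):

```json
PUT _ingest/pipeline/computed-id
{
  "processors": [
    {
      "fingerprint": {
        "fields": ["field1", "field2", "record_time"],
        "method": "MD5",
        "target_field": "computed_id"
      }
    }
  ]
}
```

By default the processor uses SHA-1 and writes to a field named `fingerprint`; `method` and `target_field` override both here. Note that its input encoding differs from a plain string concatenation, so the resulting MD5 values will not match those produced by the Logstash filter above.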

Oddly enough, I was fiddling around with the ingest node this past weekend and made a processor to do exactly this. It supports MD5, SHA1, and SHA256: https://github.com/eskibars/elasticsearch-ingest-hashfields . It currently only works on string data, but it should be easy enough to extend if you need other types.

Thank you so much for your answer.
I also need to use this hash as the _id field. Can you provide an example of how to do that?
The _id field is important for deduplication in our Elasticsearch index; it ensures we do not store repeated documents.
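One way to do this: ingest pipelines can write to the `_id` metadata field directly from a `script` processor, so after the hash has been computed you can copy it into `_id`. A sketch, assuming the hash processor has already written the hash to a field named `computed_id` (adjust the field name to whatever your hash processor actually produces):

```json
PUT _ingest/pipeline/computed-id
{
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ctx._id = ctx.computed_id"
      }
    }
  ]
}
```

Indexing with `?pipeline=computed-id` then makes documents that hash to the same value overwrite each other instead of creating duplicates, which is the deduplication behavior described above.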

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.