Using my own document_id - is there a faster way?

This is the config from my old logstash. As I said, I have recently upgraded to 6.0 and I wanted to know if this was still the way to do it.
The answer seems to be no! I have looked at fingerprint (which came in at 5.4) and it looks like the way to do it, so I will change my logstash to use this new plugin.
Thanks

The fingerprint filter has been around for a very long time and is not new. As you can see from this blog post, this is still the way to do duplicate prevention.

Does this look like the right way to do it?

    fingerprint {
      key => "secret"
      method => "SHA1"
      source => [ "ip", "fingerprint_sha1" ]
      concatenate_sources => true
      target => "[@metadata][fingerprint]"
    }


    output {
      elasticsearch {
        user => xx
        password => xx
        action => "index"
        document_id => "%{[@metadata][fingerprint]}"
      }
    }
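
To illustrate why a content-derived document_id prevents duplicates, here is a minimal Python sketch. It assumes the keyed SHA1 method behaves like HMAC-SHA1; the way the source fields are joined (`name=value` pairs separated by `|`) is an illustrative assumption and not the plugin's exact serialization, and the `index` dict is only a stand-in for an Elasticsearch index.

```python
import hashlib
import hmac


def fingerprint(event, sources, key):
    """Derive a deterministic ID from selected fields of an event."""
    # Assumption: the join format below is illustrative only; the real
    # fingerprint filter has its own way of concatenating sources.
    joined = "|".join(f"{name}={event[name]}" for name in sources)
    return hmac.new(key.encode(), joined.encode(), hashlib.sha1).hexdigest()


index = {}  # hypothetical stand-in for an Elasticsearch index keyed by _id


def index_event(event):
    """Index an event under its fingerprint, overwriting any earlier copy."""
    doc_id = fingerprint(event, ["ip", "fingerprint_sha1"], "secret")
    index[doc_id] = event  # same _id replaces the document instead of adding one
    return doc_id


event = {
    "ip": "209.212.33.52",
    "fingerprint_sha1": "b82d0ec6825ba8ee89220d2a609dc851a2797027",
}
first_id = index_event(event)
second_id = index_event(dict(event))  # the same event arriving a second time
print(first_id == second_id, len(index))
```

Because the ID depends only on the chosen fields and the key, replaying the same event lands on the same _id and overwrites the earlier copy rather than creating a duplicate.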

I am seeing deprecation warnings but I am not clear if they are related.

17-11-27T12:41:24,700][WARN ][o.e.d.i.m.UidFieldMapper ] Fielddata access on the _uid field is deprecated, use _id instead

This is not working as I thought it would?
From the ES index I have this:

ip               = 209.212.33.52
fingerprint_sha1 = b82d0ec6825ba8ee89220d2a609dc851a2797027
_id              = 2bd3142564d0d83691b3bb014e6404a8

But I was expecting _id to be the document_id? So shouldn't that be a SHA1 value? Effectively SHA1(ip + fingerprint_sha1)

Am I misunderstanding this completely?
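
One quick sanity check is to compare the length of the observed _id with the hex digest lengths of common hash functions. The sketch below uses only Python's standard `hashlib`; the choice of candidate algorithms is an assumption made for illustration.

```python
import hashlib

observed_id = "2bd3142564d0d83691b3bb014e6404a8"

# Hex digest length (two hex characters per byte) for a few candidate algorithms.
hex_lengths = {
    name: hashlib.new(name).digest_size * 2
    for name in ("md5", "sha1", "sha256")
}

# Which candidates produce a digest of the same length as the observed _id?
matches = [name for name, length in hex_lengths.items() if length == len(observed_id)]
print(len(observed_id), hex_lengths, matches)
```

A SHA1 digest is 40 hex characters, while the _id above is 32 characters long, which is the length of a 128-bit digest such as MD5.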

That looks like a hash is being used for the _id, so it looks fine as far as I can see.

Over on a separate thread in Elasticsearch I discussed and implemented 2 instances per node in my cluster, along with many of the other changes discussed here.

I also changed my logstash configuration so that all data is sent to a single ES instance, which then passes the data on within the cluster, rather than flooding all 8 instances (2 per node) with data, which had overloaded everything.

It's all stable and loading data in the 12,000-15,000/s range.
Thanks for all of the help and guidance.
