Implement Dynamic Hash Table?

Firstly, both the dns and jdbc_streaming filters can do time-based caching, so their cost may not be as bad as you think.

Secondly, a memcached filter requires a memcached instance running somewhere external to Logstash. You could do it, but that means you are adding a network call for each event. Furthermore, there are numerous race conditions, which means multiple threads may 'set' the memcached entry for the same person at about the same time.

I would at least try implementing this with an aggregate filter that uses push_map_as_event_on_timeout. Make sure you disable java_execution in addition to setting pipeline.workers to 1, since aggregate relies on events being processed in order by a single worker. Yes, that will limit throughput.

Configure the aggregate filter with the [Person] field as the task_id. Set timeout_tags to [ "deleteMe" ] and add

if "deleteMe" in [tags] { drop {} }

to drop the events created as entries are purged from the cache of recently seen names.

Then in the code option do something like this (I have not tested it)

code => '
    if map["justSawThisPerson"]
        event.set("[@metadata][justSawThisPerson]", true)
    end
    map["justSawThisPerson"] = true
'

Then, finally, make all the expensive filters conditional upon

if ! [@metadata][justSawThisPerson] {
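
Putting those pieces together, the filter section might look roughly like the sketch below. Like the snippet above, it is untested. The timeout of 300 seconds is purely an assumption for illustration (it controls how long a name stays in the "recently seen" cache), the "%{[Person]}" sprintf form of task_id assumes the name lives in a top-level [Person] field, and the comment inside the last conditional is a placeholder for whatever expensive filters you actually have.

```
filter {
  aggregate {
    # One map per distinct person name
    task_id => "%{[Person]}"
    code => '
      # The map already has the flag only if an earlier event created it
      if map["justSawThisPerson"]
        event.set("[@metadata][justSawThisPerson]", true)
      end
      map["justSawThisPerson"] = true
    '
    # When a map expires it is pushed as a new event tagged deleteMe
    push_map_as_event_on_timeout => true
    timeout => 300                     # assumed value, tune to taste
    timeout_tags => [ "deleteMe" ]
  }

  # Discard the events generated when cache entries are purged
  if "deleteMe" in [tags] { drop {} }

  if ! [@metadata][justSawThisPerson] {
    # expensive filters go here (dns, jdbc_streaming, ...)
  }
}
```

The first event for a given person finds an empty map, so it is not flagged and goes through the expensive filters. Later events for the same person, arriving before the map times out, get the [@metadata] flag and skip them. Since the flag is under [@metadata], it will not appear in the output.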