Hashing in Elasticsearch

Hello Team,

We have a requirement to store an array of 100,000 users in an Elasticsearch field. During search, we need to match if a user exists in that array and return the corresponding document.

Is it possible to achieve this using hashing or any other approach?

Why not just use a standard search on the field?

But yes, you can compute a fingerprint with the fingerprint ingest processor: see "Fingerprint processor" in the Elasticsearch Guide [8.8].

Hi @dadoonet,

Thanks. We are using Elasticsearch 5.6. Is there a way to do this in version 5.6?

You will have to do this with Logstash or custom code then.
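As a rough illustration of the "custom code" route, the fingerprint can be computed client-side before indexing and again at query time. This is only a sketch: the field name `user_hashes`, the choice of SHA-1, and the helper names are assumptions made for the example, and the field is assumed to be mapped as `keyword`.

```python
import hashlib


def fingerprint(user_id: str) -> str:
    """Return a stable hex fingerprint for a user ID (SHA-1 chosen as an assumption)."""
    return hashlib.sha1(user_id.encode("utf-8")).hexdigest()


def build_document(doc_id: str, users: list) -> dict:
    """Build the document body to index; `user_hashes` is a hypothetical field name."""
    # A set removes duplicate users; sorting keeps the output deterministic.
    return {"doc_id": doc_id, "user_hashes": sorted({fingerprint(u) for u in users})}


def build_query(user_id: str) -> dict:
    """Build a search body that matches documents whose array contains the user."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"user_hashes": fingerprint(user_id)}}]
            }
        }
    }


doc = build_document("doc-1", ["alice", "bob", "alice"])
query = build_query("alice")
```

The same hashing function has to be used on both the indexing and the query side, otherwise the term filter will never match.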

Also note that 5.X is very much EOL and no longer supported, you should be looking to upgrade as a matter of urgency.

It sounds like this is a user permissions field that you filter on. If this is the case, I also assume you will be adding and/or removing users on a regular basis.

If so, you should be aware that updating very large documents (which this could be) is expensive. One way to handle this that I have seen in the past is to use a parent-child relationship where the parent is the document and the children hold the permitted users. This does complicate querying, as you would need to add a has_child query to every query you run, which would have a performance impact. You would need to benchmark to see the impact, but note that having single very large documents can also have negative performance side effects.

Having potentially 100,000 child objects per document may not be optimal, so a workaround could be to create a fixed number of child objects and hash users into these. If you have 100 child objects per document, each could hold an array of roughly 1,000 users. This would result in smaller documents that are more efficient to update and modify.
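The bucketing idea could be sketched roughly as follows. Everything named here is hypothetical: the bucket count of 100, the child type `permitted_users`, the `users` field, and the child ID scheme are assumptions for illustration, and the dictionaries are plain Python structures rather than actual client calls. A stable hash (MD5 here) is used because Python's built-in `hash()` is not consistent across processes for strings.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 100  # assumed fixed number of child documents per parent


def bucket_for(user_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a user ID to a stable bucket number in [0, num_buckets)."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def build_children(parent_id: str, users: list) -> list:
    """Group users into at most NUM_BUCKETS child documents for one parent."""
    buckets = defaultdict(list)
    for user in users:
        buckets[bucket_for(user)].append(user)
    return [
        {
            "id": f"{parent_id}-{bucket}",  # hypothetical child ID scheme
            "parent": parent_id,
            "bucket": bucket,
            "users": sorted(members),
        }
        for bucket, members in sorted(buckets.items())
    ]


def has_child_clause(user_id: str, child_type: str = "permitted_users") -> dict:
    """Build a has_child clause that matches parents with a child listing the user."""
    return {
        "has_child": {
            "type": child_type,
            "query": {"term": {"users": user_id}},
        }
    }


users = [f"user-{i}" for i in range(1000)]
children = build_children("doc-1", users)
clause = has_child_clause("user-42")
```

With this layout, adding or removing a user only requires rewriting the single child document identified by `bucket_for(user_id)`, rather than the whole 100,000-entry array.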
