Nested Vector Search -> Retrieve k chunks

anuj99 · April 12, 2024, 5:11pm

The following is my use case: storing data in Elasticsearch for a workspace search that connects different data sources.

Text from each file is chunked, and each chunk is stored as a separate document used for vector and keyword search.
However, each file has a set of allowed users as well as allowed groups who can access it. Users can belong to groups, and search results must respect that access. I want to support both keyword and semantic search.
I want to avoid duplicating permissions on every text chunk.
What's the best way to index such data so that filtering is also easy? I want to filter the data as part of the query rather than applying a pre-filter or post-filter.
For access control, each document can have a list of allowed users, a list of allowed groups, or be accessible to all users.
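
A minimal sketch of one possible layout for the requirements above, written as plain Python dictionaries in Elasticsearch mapping and query DSL form: one document per file, permissions stored once at the top level, and chunks held in a nested field. All field names (`allowed_users`, `allowed_groups`, `is_public`, `passages`), the 384-dimension vector, and the helper `build_access_filter` are assumptions made for illustration, not something taken from the thread.

```python
# Hypothetical index layout: one document per file, permissions stored once,
# chunks kept as nested objects so they are not duplicated per permission set.
INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "file_name": {"type": "keyword"},
            "allowed_users": {"type": "keyword"},
            "allowed_groups": {"type": "keyword"},
            "is_public": {"type": "boolean"},  # assumed flag for "all users"
            "passages": {
                "type": "nested",
                "properties": {
                    "text": {"type": "text"},
                    "vector": {
                        "type": "dense_vector",
                        "dims": 384,  # assumption: depends on the embedding model
                        "index": True,
                        "similarity": "cosine",
                    },
                },
            },
        }
    }
}


def build_access_filter(user_id, group_ids):
    """Return a bool query matching files the user may read.

    A file matches if it is public, lists the user directly,
    or lists at least one of the user's groups.
    """
    should = [
        {"term": {"is_public": True}},
        {"term": {"allowed_users": user_id}},
    ]
    if group_ids:
        should.append({"terms": {"allowed_groups": list(group_ids)}})
    return {"bool": {"should": should, "minimum_should_match": 1}}
```

Because the permission fields live on the top-level document, the same filter could in principle be attached to both a keyword query and a vector query, which is the property the question is after.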

Suppose I have millions of files and their permissions to index, for example an organisation's Google Drive accessed through service accounts. What would be the ideal data storage strategy for optimal search?

Suppose I decide to store a single document per file with nested vector search. I have a use case of finding the top-k passages irrespective of the top-level document.

Even if the top 2 passages come from the same document, both of those passages should be returned.
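
One way the requirement above might be approached, sketched under assumptions: ask a nested kNN search for several inner hits per file, then flatten the inner hits across all returned files and keep the k best passages by score. The field names (`passages.vector`, `passages.text`, `file_name`), the helper names, the `num_candidates` heuristic, and the inner hits key defaulting to the nested path are all assumptions. The result is approximate, since passages in files outside the returned top-level hits, or beyond the per-file inner hits size, are never seen.

```python
import heapq


def build_nested_knn_body(query_vector, access_filter, k=10, passages_per_file=5):
    """Build a search body for kNN over a nested vector field.

    The filter targets top-level (file) metadata; inner_hits asks for
    several matching passages per file instead of only the best one.
    """
    return {
        "knn": {
            "field": "passages.vector",
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": max(100, 10 * k),  # heuristic, tune as needed
            "filter": access_filter,
            "inner_hits": {
                "size": passages_per_file,
                "_source": False,
                "fields": ["passages.text"],
            },
        },
        "_source": ["file_name"],
    }


def top_k_passages(response, k, inner_name="passages"):
    """Flatten inner hits from every file and return the k highest-scoring."""
    rows = []
    for file_hit in response["hits"]["hits"]:
        inner = file_hit.get("inner_hits", {}).get(inner_name, {})
        for passage in inner.get("hits", {}).get("hits", []):
            texts = passage.get("fields", {}).get("passages.text", [None])
            rows.append(
                {
                    "file_id": file_hit["_id"],
                    "score": passage["_score"],
                    "text": texts[0],
                }
            )
    return heapq.nlargest(k, rows, key=lambda row: row["score"])
```

With this shape, two passages from the same file can both appear in the final list, provided `passages_per_file` is at least 2 and the inner hit scores are comparable across files.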

Sean_Story · April 15, 2024, 1:52pm

I believe you've already asked this question here: Elastic search document access based control and vector search

Please refrain from opening duplicate discuss threads, and be patient as we work to answer your old post. The discuss forums are monitored by our engineers during free cycles, and don't have a guaranteed response timeframe. If you need faster replies, consider our support or consulting services.
