Elasticsearch Solution Needed for Large User-Specific Document Searches

Hi,

We are facing a challenge with Elasticsearch indexing and searching, specifically with App Search.

Here's the situation:

We have an index containing books, and users who can search only through the books they own. Some users can own up to 100,000+ books, which makes using filters impractical due to their limits.

We considered adding a field to each book document containing the userId of each user who has purchased it. However, updating this field is too slow, especially since each user can have over 100,000 books. Additionally, there are user groups with different permissions, and the group owner can control which users have access to which books.

Another idea was to create a separate Elasticsearch index for users, which would include the books relevant to each user. However, with the App Search filter limit of 1,024, this approach isn't feasible.

Document-level security was also considered, but it appears to be restricted to API keys, and App Search doesn't seem to support it. While App Search does allow adding users, managing 100,000+ users this way is not practical.

We’re currently stuck and seeking a viable solution.

Any help would be appreciated!

Kind regards,
Chenko

Document-level security was also considered, but it appears to be restricted to API keys, and App Search doesn't seem to support it

That's not quite accurate. See an example: Leverage DLS from connectors in App Search | Enterprise Search documentation [8.14] | Elastic

However, this doesn't scale well, as you'd still have to manage 100,000+ signed search keys.

We are facing a challenge with Elasticsearch indexing and searching, specifically with App Search.

The issue that you're really pushing up against is that App Search wasn't primarily designed to provide disparate search results for different users. Its primary use case is public search. Which isn't to say that other use cases aren't possible, just that they don't have the same emphasis.

Other (not great) ideas you could investigate:

  1. one engine per user (I don't expect this would scale well)
  2. Multiple requests to fetch a user's books (each user is a document with a list of book IDs; fetch the user, then fetch all their books in batches, then sort in memory for relevance)
  3. Abandon App Search and do this with Elasticsearch. You can use the App Search Explain and App Search Elasticsearch Search APIs to help ease that transition, but you'd have to do a lot more low-level management of mappings and query tuning.
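
To make idea 2 concrete, here is a minimal Python sketch of the batching and in-memory merge. It is not tied to any real client: `search_batch` is a hypothetical callable standing in for whatever search request you issue per batch, the hit shape (`id`, `score`) is an assumption, and the batch size simply reuses the 1,024 filter limit mentioned in the question. It also assumes scores from separate requests against the same engine are comparable, which you would want to verify.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical hit shape: {"id": "<book id>", "score": <relevance score>}
Hit = Dict[str, object]
# Hypothetical callable: runs one search restricted to the given book ids.
SearchFn = Callable[[str, List[str]], List[Hit]]

FILTER_LIMIT = 1024  # filter limit quoted in the question (assumed batch size)


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each with at most `size` entries."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def search_owned_books(
    query: str,
    book_ids: List[str],
    search_batch: SearchFn,
    top_n: int = 10,
    batch_size: int = FILTER_LIMIT,
) -> List[Hit]:
    """Run one search per batch of owned ids, then merge by score in memory."""
    hits: List[Hit] = []
    for batch in chunked(book_ids, batch_size):
        hits.extend(search_batch(query, batch))
    # Highest score first; assumes scores are comparable across requests.
    hits.sort(key=lambda hit: hit["score"], reverse=True)
    return hits[:top_n]
```

For a user with 100,000 books this means 98 requests per search (97 full batches of 1,024 plus one of 672), which illustrates why the approach is listed as "not great".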

@Chenko you also may want to consider Consulting Services for the Elastic Stack | Elastic Consulting

I notice you're quite active in our forums with some very broad topics. If you find you need more dedicated help working through optimizing your architecture and/or products, consulting might be a better and more consistent fit.

Hi Sean, Thanks for the swift response!

I did not read that yet, thanks for letting me know!

Indeed, managing that many signed search keys would not scale well.

Indeed, I agree. However, our client has a public data set built for public search, which uses App Search. They now want the same kind of filtering on each user's own documents, so they wanted App Search for this as well: if they tweak the relevance tuning for the public search, they can apply the same tuning to the users' private documents so that both behave alike.

On one engine per user: I also thought this would not scale well, but I have not actually tested anything like it. My first thought is that it would mean a lot of data to store.

On the multiple-requests idea: this could be done, but then we would be missing out on App Search's relevance tuning, and possibly also curations.

On abandoning App Search for Elasticsearch: this was also a thought of mine. However, like you said, it might be messy with a lot of low-level management and query tuning. That being said, even if we do this, how would we allow users to access only their own documents (in ES)?

Thanks for the suggestion, but I am actually an ES consultant myself.

I thought it would be great to ask the questions I am struggling with here, so I can help both myself and others in the future.

Kr, Chenko

Hi there @Chenko - @Sean_Story has some good suggestions about how to make this work with App Search, but I think his suggestion about using Elasticsearch APIs over App Search is probably the best solution in terms of scalability and versatility.

If you want to leverage some of the transparency of App Search you could look into a couple of things:

  • Consider storing your advanced query in a search application or a search template. You could even start with an App Search engine here, and use the Elasticsearch query that App Search generates as input into your template.
  • Use query rules instead of curations
  • Use our synonyms API for synonyms management

Then you'd have the flexibility of the entire Elasticsearch query DSL for your niche query requirements.
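
As one illustration of what the query DSL allows for the "only their own books" requirement, the sketch below builds a `bool` query whose filter uses a `terms` lookup: the list of owned book IDs is read from a per-user document at query time instead of being sent with each request. Everything named here is an assumption for illustration only: the `users` index, its `book_ids` field, the `book_id` keyword field on book documents, and the `title`/`description` search fields. Note also that Elasticsearch caps a `terms` query (including looked-up terms) at `index.max_terms_count`, 65,536 by default, so users with 100,000+ books would need that setting raised or a different design.

```python
import json
from typing import Any, Dict


def owned_books_query(user_id: str, text: str) -> Dict[str, Any]:
    """Build a search body restricted to the books listed on a user document.

    Assumed (hypothetical) layout: index "users" holds one document per user
    with a "book_ids" array; book documents carry a keyword field "book_id".
    """
    return {
        "query": {
            "bool": {
                # Scored part: the user's free-text search.
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "description"]}}
                ],
                # Unscored part: terms lookup against the user's document.
                "filter": [
                    {
                        "terms": {
                            "book_id": {
                                "index": "users",
                                "id": user_id,
                                "path": "book_ids",
                            }
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(owned_books_query("user-42", "dune"), indent=2))
```

The resulting body could be sent as an ordinary search request, or stored in a search template with the user ID and query text as parameters. The user ID must come from your own authenticated backend rather than from the client, since the filter is only as trustworthy as the ID passed in.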

Also, yes, if you're running into these questions I'm sure they're very useful to other users so thank you!


Thanks, both, for the suggestions. We have currently parked the issue.