I have read a lot on the forums about the pros and cons of multiple indices versus a single index, and I would like some advice for my specific use case. I have a pipeline that ingests Word documents from clients and then runs a query against them. Neither the query nor the documents change, so under normal circumstances only one query is executed and the results are expected to stay the same. Is it better for me to have:
- a single index for multiple customers and all of their documents
- one index per customer, holding all of that customer's documents
- one index per customer plus a separate index for each set of documents (for one query)
- one index for every set of documents, dropped after the query completes
Thank you for your advice.
The answer is "it depends". Do you have specific retention periods for different customers? Do you want to be able to bill based on resource use (e.g. disk)? Do you have security and privacy requirements for some customers?
If so, then splitting by customer might make a lot of sense.
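If you do go per customer, one way to keep things manageable is to derive the index name from the customer ID, so retention or disk-usage checks can target that one index. Here is a minimal sketch of that idea. The `customer_index_name` helper and the `docs` prefix are made up for illustration, and it assumes the usual Elasticsearch naming rules (lowercase only, no spaces or characters such as `/` and `#`, no leading `-` or `_`):

```python
import re


def customer_index_name(customer_id: str, prefix: str = "docs") -> str:
    """Build a per-customer index name (hypothetical helper, not a library call)."""
    # Lowercase, then collapse every run of characters outside a-z, 0-9, _ and -
    # into a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", customer_id.lower())
    # Trim leading and trailing hyphens/underscores left over from the cleanup.
    slug = slug.strip("-_")
    if not slug:
        raise ValueError(f"cannot derive an index name from {customer_id!r}")
    return f"{prefix}-{slug}"


print(customer_index_name("Acme Corp"))
```

Two different customer IDs can collapse to the same slug with this approach, so in practice you would probably want to use an internal numeric or UUID customer key instead of a display name.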
Thanks for sharing. So here's my thought process:
- Customer uploads documents and I index them
- After indexing, I run a query against the documents
- Since I don't need to rerun the query, I save the query results in another NoSQL database such as MongoDB
- I drop the one-time-use index
- In the future, I retrieve the search results from MongoDB
Is this an efficient way to handle such "one time use" indexes?
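To make the steps above concrete, here is a rough sketch of the flow. It does not use the real Elasticsearch or MongoDB clients: `SearchBackend`, `InMemoryBackend`, and `run_one_time_query` are hypothetical names, the in-memory backend does a plain substring match, and a dict stands in for the results database. It only shows the ordering (create, index, query, always drop, persist results), not the actual client calls.

```python
import uuid
from typing import Protocol


class SearchBackend(Protocol):
    """Hypothetical interface; a real search client would sit behind it."""

    def create_index(self, name: str) -> None: ...
    def add_document(self, index: str, text: str) -> None: ...
    def search(self, index: str, term: str) -> list[str]: ...
    def drop_index(self, name: str) -> None: ...


class InMemoryBackend:
    """Toy stand-in so the sketch runs without a cluster."""

    def __init__(self) -> None:
        self.indices: dict[str, list[str]] = {}

    def create_index(self, name: str) -> None:
        self.indices[name] = []

    def add_document(self, index: str, text: str) -> None:
        self.indices[index].append(text)

    def search(self, index: str, term: str) -> list[str]:
        # Case-insensitive substring match, standing in for a real query.
        return [doc for doc in self.indices[index] if term.lower() in doc.lower()]

    def drop_index(self, name: str) -> None:
        del self.indices[name]


def run_one_time_query(
    backend: SearchBackend,
    results_store: dict[str, list[str]],  # stand-in for the results database
    customer_id: str,
    documents: list[str],
    term: str,
) -> str:
    """Index the documents, query once, drop the index, and store the hits."""
    job_id = f"{customer_id}-{uuid.uuid4().hex}"  # unique name per document set
    backend.create_index(job_id)
    try:
        for doc in documents:
            backend.add_document(job_id, doc)
        hits = backend.search(job_id, term)
    finally:
        # Drop the index even if indexing or the query fails.
        backend.drop_index(job_id)
    results_store[job_id] = hits
    return job_id


backend = InMemoryBackend()
store: dict[str, list[str]] = {}
job = run_one_time_query(
    backend, store, "acme", ["Invoice for March", "Meeting notes"], "invoice"
)
print(store[job])
```

The `try`/`finally` is the part I care about most: without it, a failed query would leave an orphaned index behind for every failed job.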
Elasticsearch isn't really designed for this sort of use case, although it will work.
What is it you are trying to achieve here? What is the use case? Why use Elasticsearch at all?