We have a web service that allows ~10 customers to look at their data. Each customer has about 1M-10M records, and our total datastore will be in the tens of gigabytes. We currently store this data in Postgres, normalized, with indexes chosen to match our queries. Of the fields we need to search, roughly 80% are keywords that must match exactly or fall within a range, and 20% are titles of a few English words each. Most searches activate only one or two keyword filters; in other words, searches involve very simple filters, not complex queries. We need these searches to complete within a couple of seconds to keep the web application responsive. We don't need fuzzy matching, but some basic text normalization (lowercasing, removing punctuation, etc.) would be useful.
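For reference, the normalization we have in mind is minimal; a Python sketch (the function name and token rules are just illustrative, not what we actually run):

```python
import re

def normalize(text):
    """Lowercase, strip punctuation, and split into tokens."""
    text = text.lower()
    # replace anything that is not a word character or whitespace
    text = re.sub(r"[^\w\s]", " ", text)
    return text.split()

normalize("The Quick, Brown Fox!")  # ['the', 'quick', 'brown', 'fox']
```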
- Should we adopt Elasticsearch on top of Postgres for these needs? Postgres must stay for other reasons.
- We require consistency in our search results: if an item is updated while a search filter is active, the results must reflect the change immediately. For example, if we are filtering on a title within a location and that title changes, the item must drop out of the active filter right away. My assumption is that we would need to write updates to Elasticsearch and wait for confirmation, while also writing the same changes to Postgres. Dual writes like that are a complex solution, and keeping Elasticsearch in sync with Postgres on top of that would be difficult to manage. Is this a valid reason to avoid adding another datastore?
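The usual alternative to dual writes is a transactional outbox: commit the row change and a sync record in the same Postgres transaction, then have a worker replay the outbox into Elasticsearch. A toy in-memory sketch (all names hypothetical, plain dicts and lists standing in for the real stores) shows why that model is only eventually consistent, which is exactly the property we're worried about:

```python
records = {}       # stands in for the Postgres table
outbox = []        # stands in for an outbox table in the same database
search_index = {}  # stands in for Elasticsearch

def update_record(record_id, title):
    # one "transaction": the row change and the outbox entry commit together
    records[record_id] = {"title": title}
    outbox.append({"id": record_id, "title": title})

def drain_outbox():
    # background worker shipping committed changes to the search index
    while outbox:
        change = outbox.pop(0)
        search_index[change["id"]] = change["title"]

update_record(1, "Annual Report")
drain_outbox()
update_record(1, "Quarterly Report")
stale = search_index[1]   # still "Annual Report" until the worker runs
drain_outbox()
fresh = search_index[1]   # now "Quarterly Report"
```

Postgres itself never has this window: a search that runs in the same database sees the committed row immediately.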
- Our QPS will be quite small, but a single search can span millions of records. My biggest concern is whether Postgres can service those searches within a couple of seconds.
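A rough sanity check on that concern, assuming one B-tree index probe per keyword filter (back-of-envelope arithmetic, not a benchmark):

```python
import math

# An equality filter on an indexed column walks a B-tree: O(log n)
# comparisons, not a scan of all n rows.
rows = 10_000_000
comparisons = math.ceil(math.log2(rows))  # ~24 comparisons for 10M rows

# A range filter then reads only the matching fraction of rows;
# e.g. a 1%-selective filter touches ~100k rows, not 10M.
selectivity = 0.01
rows_touched = int(rows * selectivity)
```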
- With only a few words per title, do we need TF-IDF relevance scoring in Elasticsearch?
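To make that question concrete, here is a toy TF-IDF computation over short titles (Elasticsearch's actual default scoring is BM25, but the intuition is similar). With titles of only a few words, term frequency is nearly constant across documents, so scoring mostly reduces to term rarity in the corpus:

```python
import math

def tf_idf(term, doc, corpus):
    """Classic TF-IDF for a tokenized document within a corpus."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)  # document frequency
    return tf * math.log(len(corpus) / df)

titles = [["red", "widget"], ["blue", "widget"], ["red", "gadget"]]
# "gadget" is rarer than "widget", so it scores higher in its title:
common = tf_idf("widget", ["red", "widget"], titles)
rare = tf_idf("gadget", ["red", "gadget"], titles)
```

If our filters are exact matches anyway, relevance ranking of this kind may simply never be exercised.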