Design Decision: Serve Search Result from ES or cached DB

Andreas_Weber · February 19, 2017, 9:32am

Hi,

We are currently developing a new service that is essentially a catalogue for online marketing purposes.

The catalogue consists of 4 million highly structured, customer-specific products.
It can be searched through Elasticsearch using about 50 search filters, including aggregations and histograms.

From a design point of view, we see two options right now:

1. Serving the search results completely from the ES index (incl. caching), OR
2. Only retrieving the IDs of the documents from the ES index and serving the documents representing the search results from our cache-backed database (Hazelcast => Hibernate => Oracle).
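
As a rough illustration of how the two options differ on the request side, here is a minimal Python sketch. It only builds Elasticsearch request bodies as plain dictionaries and resolves IDs through a dictionary standing in for the cache. The function names, the `brand` field in the usage note, and the `load_from_db` callback are hypothetical; the sketch also assumes the filter fields are exact-match (keyword) fields, so a `term` filter applies. Setting `"_source": false` in the search body asks Elasticsearch to return hits without the stored document, leaving only metadata such as `_id`.

```python
from typing import Callable, Dict, Iterable, List


def full_document_query(filters: Dict[str, str], size: int = 20) -> dict:
    """Option 1: request body whose hits carry the full stored _source."""
    return {
        "size": size,
        "query": {
            "bool": {
                # One exact-match filter clause per (field, value) pair.
                "filter": [{"term": {field: value}} for field, value in filters.items()]
            }
        },
    }


def ids_only_query(filters: Dict[str, str], size: int = 20) -> dict:
    """Option 2: same query, but hits come back without _source (IDs only)."""
    body = full_document_query(filters, size)
    body["_source"] = False
    return body


def hit_ids(response: dict) -> List[str]:
    """Extract document IDs from a search response in result order."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def load_products(
    ids: Iterable[str],
    cache: Dict[str, dict],
    load_from_db: Callable[[str], dict],
) -> List[dict]:
    """Option 2: resolve IDs via the cache, falling back to the database on a miss."""
    products = []
    for product_id in ids:
        product = cache.get(product_id)
        if product is None:
            product = load_from_db(product_id)  # hypothetical DB accessor
            cache[product_id] = product
        products.append(product)
    return products
```

For example, `ids_only_query({"brand": "acme"})` produces a body that could be sent to the `_search` endpoint, and `load_products(hit_ids(response), cache, load_from_db)` would then assemble the result page in the order Elasticsearch ranked it.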

Right now we would prefer (1.), with (2.) as a future option.
Is (1.) a valid option with regard to scalability and throughput?

Cheers
Andreas

dadoonet · February 19, 2017, 10:12am

I prefer option 1, but if you want to get back managed entities, option 2 is better. Good news: Hibernate Search now supports Elasticsearch, so option 2 can be easy to implement.

Still, if you just want to display results to the user, I'd go with option 1.

jprante · February 19, 2017, 10:47am

Keep it simple.

It's not a design decision but a cost-benefit decision. You have to make it for yourself.

To find out the ratio, run a performance benchmark on options 1 and 2, and measure how much throughput you can achieve. Take the future growth of your requirements into account (you want scalability but do not give any estimated target).
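
A minimal sketch of such a throughput measurement, in Python, could look like the following. The `run_search` callable is a hypothetical placeholder for whatever executes one complete search under option 1 or option 2. This version is sequential and single-threaded, so it gives only a lower bound on what a server can sustain; a realistic benchmark would add concurrent clients, a representative mix of filters and aggregations, and warm-up runs.

```python
import time
from typing import Callable


def measure_throughput(run_search: Callable[[], object], requests: int = 1000) -> float:
    """Run `run_search` sequentially `requests` times; return requests per second."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    start = time.perf_counter()
    for _ in range(requests):
        run_search()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return requests / elapsed if elapsed > 0 else float("inf")
```

Running it once with a callable for option 1 and once with a callable for option 2, against the same data set and query mix, gives two numbers that can be compared directly.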

Then add up the people and effort (licenses, staff, machines) you need to maintain option 1 (ES) versus option 2 (ES/Hazelcast/Hibernate/Oracle stack) and write down the costs. Here, you also have to take the future growth of costs into account.

If the benefit of performance is so high that it justifies the cost, go for it.

Andreas_Weber · February 19, 2017, 11:01am

We already know how the option (2.) scenario performs: because it uses a local memory and filesystem cache, it is fast and will scale (independently of ES). We will have that implementation in any case for showing single products.

As far as I understand, option (1.) "should" work.

@jprante: Benchmarking option (1.) and seeing where it breaks is indeed a very good idea.

