Using ES as a primary data store for eBay web services


(Sebastian Herrera) #1

Hi and thanks in advance,

Business case: I need to create an e-commerce frontend with ES, using facets (aggregations) to replicate this functionality:

My idea is to feed data from the eBay web services directly into ES and store it there (an eBay requirement), using ES as the primary data store for the website. Because eBay handles millions of products, I'm planning to use a search server instead of a typical RDBMS.

I have the following questions:
a- Is it possible to use ES as a primary data store?
b- Are Logstash or ES-Hadoop a solution for gathering web service data and loading it into Elasticsearch? (Solr provides the Data Import Handler, but I cannot find a similar approach in ES.)
c- If yes: are Logstash or ES-Hadoop fast enough to gather web service information in real time, based on a customer request, and display the new information (together with previously retrieved and indexed information) on the frontend instantly?

Thank you for any advice, I'm new to this area. brgds
Sebastian


(Magnus Bäck) #2

a. Yes

b. I'm not familiar with ES-Hadoop, but Logstash can help you with this. For more complex cases you may need to write your own import glue.

c. What do you mean by "on customer request in real time"? If the primary data storage is elsewhere, you'd typically run a periodic import of everything that's new since last time or, if possible, implement streamed updates where the primary datastore feeds changes as they arrive. But are you talking about search requests from an end user triggering e.g. Logstash to fetch data into ES and then performing a search against that data?
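
A minimal sketch of the periodic import idea, assuming hypothetical item records with `id` and `updated_at` fields and a hypothetical index name `products` (none of these come from the thread). It keeps only the items changed since the last run and builds a newline-delimited JSON body in the shape the Elasticsearch bulk API expects; actually sending it to the `_bulk` endpoint is left out.

```python
import json


def build_bulk_body(items, last_run, index_name="products"):
    """Return (bulk_body, new_watermark) for items updated after last_run.

    `items`, `updated_at` and `index_name` are illustrative assumptions.
    """
    lines = []
    watermark = last_run
    for item in items:
        if item["updated_at"] <= last_run:
            continue  # already imported by a previous run
        # Action line followed by the document source, one JSON object per line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": item["id"]}}))
        lines.append(json.dumps(item))
        watermark = max(watermark, item["updated_at"])
    # The bulk format requires a trailing newline after the last line.
    body = "\n".join(lines) + "\n" if lines else ""
    return body, watermark


# Made-up sample data: only the second item is newer than the last run.
items = [
    {"id": "1", "title": "Camera", "updated_at": 100},
    {"id": "2", "title": "Lens", "updated_at": 205},
]
body, watermark = build_bulk_body(items, last_run=150)
```

The returned watermark would be stored and passed as `last_run` on the next scheduled run, so each run only picks up what changed in between.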


(Sebastian Herrera) #3

Thank you, Magnus Bäck,

c- Yes, for example: a user performs a search request, and the application triggers e.g. Logstash to fetch data into ES and then performs a search against that data. My main question is whether real-time indexing is fast enough to display results within (sub)seconds.

brgds!


(Magnus Bäck) #4

c- Yes, for example: a user performs a search request, and the application triggers e.g. Logstash to fetch data into ES and then performs a search against that data. My main question is whether real-time indexing is fast enough to display results within (sub)seconds.

No, that doesn't really make sense and you probably won't get sufficiently good performance out of it.


(Nik Everett) #5

Elasticsearch usually refreshes the index every second. This is configurable and is one of the biggest factors in bulk loading performance. You can make the interval smaller, or even refresh on every update, but that tends to cause very bad performance, so it's not a good idea.

The usual way to handle this kind of UI consistency issue with Elasticsearch is to push it to the client. So after the client has made some change to inventory, they could click a button that retries the search for the change every few milliseconds until the new version of the item becomes visible in the search index.
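
A rough sketch of that retry loop, written in Python only to show the logic (in practice it would live in the client). The function names, the interval, and the timeout are assumptions for illustration, and `fake_search` is a stand-in for a real search request.

```python
import time


def wait_until_visible(search, is_visible, interval=0.05, timeout=2.0):
    """Call search() repeatedly until is_visible(result) is true.

    Returns the matching result, or None if the timeout passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = search()
        if is_visible(result):
            return result
        if time.monotonic() >= deadline:
            return None  # give up; the change has not become searchable yet
        time.sleep(interval)


# Stand-in for a real search call: the updated document only becomes
# visible on the third attempt, as if an index refresh happened in between.
calls = {"n": 0}


def fake_search():
    calls["n"] += 1
    return {"version": 2} if calls["n"] >= 3 else {"version": 1}


found = wait_until_visible(fake_search, lambda r: r["version"] == 2, interval=0.01)
```

The timeout matters here: without it, a change that never reaches the index would keep the client polling forever.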


(Sebastian Herrera) #6

Thank you, Magnus!


(Sebastian Herrera) #7

Thanks, Nik.

