Elasticsearch vs MySQL for large data set using exact value searching

(Ian) #1

I'm evaluating using Elasticsearch vs sharding (or partitioning) MySQL for a record set of around 2.5 billion records of frequently accessed data. The data is simple in the format of (customer_id, product_id, qty, price, and a few other miscellaneous columns). The miscellaneous columns will never be searched against, they are just data needed.

The queries will always be in the form of WHERE customer_id =? AND product_id IN (?) AND qty=?, perhaps occasional WHERE customer_id=? or WHERE product_id=?. For MySQL, these are easily indexed. I've already done some testing and on first load, it is around 0.8 seconds for around 1.1 billion rows and the second load is in the 0.05-second range. In the final solution, this would be sharded over multiple DBs or partitioned over multiple tables so I would expect faster results in either case.

Would elasticsearch perform better on the first pass for this amount of data given the type of searching we are doing? We have a large budget so we can cluster, shard, or get a big server for partitioning extensively in either solution.

Thank you.

(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.