Using Elasticsearch percolate + N indices as a feed system

Hey all,

We are considering building a fan-out, inbox-style feed system on top of ES
for Reverb.com. Each user can follow some number of searches. Using the
percolator, we would push each new item into the feed of every user whose
followed searches it matches. Many users will follow a common search such as
"all electric guitars", so fan-outs could get expensive.
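
To make the fan-out step concrete, here is a rough Python sketch of what we have in mind. It assumes the percolator has already returned the IDs of the searches a new item matched; the names (build_followers, fan_out, the sample search IDs) are made up for illustration, and the in-memory dicts stand in for whatever store would actually hold subscriptions and feeds.

```python
from collections import defaultdict


def build_followers(subscriptions):
    """Invert {user_id: [search_id, ...]} into {search_id: {user_id, ...}}."""
    followers = defaultdict(set)
    for user_id, search_ids in subscriptions.items():
        for search_id in search_ids:
            followers[search_id].add(user_id)
    return followers


def fan_out(item_id, matched_search_ids, followers, feeds):
    """Add item_id once to the feed of every user following a matched search."""
    recipients = set()
    for search_id in matched_search_ids:
        # .get() avoids creating empty entries for searches nobody follows.
        recipients |= followers.get(search_id, set())
    for user_id in recipients:
        feeds[user_id].append(item_id)
    return recipients


# Hypothetical data: u2 follows both matched searches but gets the item once.
subscriptions = {
    "u1": ["electric-guitars"],
    "u2": ["electric-guitars", "fender"],
    "u3": ["drums"],
}
followers = build_followers(subscriptions)
feeds = defaultdict(list)
recipients = fan_out("item-42", ["electric-guitars", "fender"], followers, feeds)
```

The cost of one new item is proportional to the number of distinct followers of the matched searches, which is exactly where a popular search gets expensive.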

We're thinking of giving each user an index that contains only item IDs. To
render a feed, we would query that index first and then query the main item
index with those IDs to get the details.
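
The two-step read could look roughly like the sketch below. It only builds the request bodies and does not talk to a cluster. The field names (item_id, added_at) are assumptions on our part, and the second step assumes the main index uses the item ID as the document ID so that an ids query can fetch the details.

```python
def feed_page_request(size=20):
    """Step 1: body for the per-user feed index, newest entries first.

    "added_at" and "item_id" are hypothetical field names.
    """
    return {
        "size": size,
        "sort": [{"added_at": {"order": "desc"}}],
        "_source": ["item_id"],
    }


def item_details_request(item_ids):
    """Step 2: body for the main item index, fetching documents by ID."""
    ids = list(item_ids)
    return {"size": len(ids), "query": {"ids": {"values": ids}}}


def order_like_feed(item_ids, hits):
    """Reorder step-2 hits to follow the feed order from step 1.

    Items missing from the hits (for example, already deleted) are skipped.
    """
    by_id = {hit["_id"]: hit for hit in hits}
    return [by_id[item_id] for item_id in item_ids if item_id in by_id]


# Hypothetical round trip: the feed says b, a, c; the item index no longer has c.
feed_order = ["b", "a", "c"]
details_body = item_details_request(feed_order)
ordered = order_like_feed(feed_order, [{"_id": "a"}, {"_id": "b"}])
```

The reordering step is there because the second query is not expected to return items in feed order.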

I would love some feedback on some of these ideas in terms of feasibility
in ES. We like the idea of building on top of ES to avoid introducing
additional infrastructure or other databases as ES is our primary search
engine.

A couple of questions:

  1. Is this a sane design? Any obvious flaws?
  2. We will need to prune items that sell out from everyone's indices. With
    an index per user, is the delete by query API expensive if we have to
    run something like "delete this product from these 1M indices"?
  3. Does percolation get expensive if we have a million users subscribed to
    500k different searches?
  4. We could potentially shard by user ID so that users 1-100k live on one
    cluster, and so on.
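
For questions 2 and 4, this is the kind of thing we are picturing, as a sketch only. It assumes numeric user IDs starting at 1, 100k users per cluster, a hypothetical "feed-<user id>" index naming scheme, and that the item ID doubles as the document ID in each feed index so a sold item can be deleted by ID. It also assumes we record which users received an item at fan-out time. The action dicts mirror the shape of bulk API delete actions, but the exact metadata fields depend on the ES version.

```python
from collections import defaultdict

USERS_PER_CLUSTER = 100_000  # assumption taken from "users 1-100k"


def cluster_for_user(user_id):
    """Map users 1..100000 to cluster 0, 100001..200000 to cluster 1, etc."""
    if user_id < 1:
        raise ValueError("user IDs are assumed to start at 1")
    return (user_id - 1) // USERS_PER_CLUSTER


def prune_actions_by_cluster(item_id, user_ids):
    """Group delete actions for one sold item by the cluster that owns each feed."""
    batches = defaultdict(list)
    for user_id in user_ids:
        # "feed-<user id>" is a hypothetical per-user index name.
        action = {"delete": {"_index": f"feed-{user_id}", "_id": item_id}}
        batches[cluster_for_user(user_id)].append(action)
    return dict(batches)


# Hypothetical: item-42 sold and had been delivered to users 5 and 100001.
batches = prune_actions_by_cluster("item-42", [5, 100_001])
```

Each cluster would then receive one batch of deletes limited to the feeds that actually contain the item, instead of a query across every index.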

Your help is greatly appreciated.

Thanks,

Yan Pritzker

--
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/33feaea6-d59b-4814-9c69-6c9f85408076%40googlegroups.com.