Routing in Elasticsearch


(Rajagopal Sathyamurthi) #1

I have a question about how best to perform routing. We are indexing three
types of data: entities, user_entity_relationships, and
user_entity_analytics. These are mapped as a parent, child, grandchild
relationship. Entities are the parents. Each entity can have multiple
user_entity_relationships as children, one for each user it belongs to
(each user_entity_relationship belongs to exactly one user). Each
user_entity_relationship can in turn have multiple user_entity_analytics as
children, one for each day; these are the grandchildren.

I've read about the importance of a custom routing scheme, and the general
recommendation seems to be to route by user. However, in our case
user-based routing doesn't work, because an entity (the parent) can belong
to multiple users. By default, a child is routed to the same shard as its
parent. But we have one or more children for each parent, and the user
information is bound to the child, not to the parent.
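
To make the problem concrete, here is a minimal sketch of how parent-based
routing spreads one user's documents across shards. It assumes routing works
roughly as hash(routing value) modulo the number of primary shards. The shard
count, the sample data, and the helper names are all hypothetical, and crc32 is
only a stand-in for whatever hash Elasticsearch actually uses.

```python
import zlib

NUM_SHARDS = 5  # assumed number of primary shards, for illustration only


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Stand-in for shard selection: hash(routing) % number of primary shards.

    crc32 is NOT the hash Elasticsearch uses; it is chosen here only so the
    sketch is deterministic and self-contained.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def route_document(doc_type: str, entity_id: str) -> int:
    """With parent-based routing, every document in the family (entity,
    relationship, analytics) is routed by the id of the top-level entity.
    Grandchildren generally need that routing value supplied explicitly."""
    return shard_for(entity_id)


# Hypothetical sample data: which users each entity belongs to.
entity_users = {
    "entity-1": ["alice", "bob"],
    "entity-2": ["alice"],
    "entity-3": ["bob"],
    "entity-4": ["alice"],
}


def shards_touched_by_user(user: str) -> set:
    """Shards that hold at least one entity (and so its children) for a user."""
    return {
        route_document("entity", entity_id)
        for entity_id, users in entity_users.items()
        if user in users
    }


if __name__ == "__main__":
    for user in ("alice", "bob"):
        print(user, sorted(shards_touched_by_user(user)))
```

Since the routing key is the entity id and not the user id, a query for one
user cannot be narrowed to a single shard under this scheme.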

Some common operations that we need to perform are:

  1. Search all entities belonging to a specific user over a certain
    date range (the date is stored inside the entity).

  2. Perform aggregations on entity analytics for a specific user over a
    date range.

  3. Perform aggregations on entity data for a user over a date range.
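
As a rough sketch of operation 1, the request body might combine a range
clause on the entity's date with a has_child clause on the relationship type.
The field names `date` and `user_id`, the type name
`user_entity_relationship`, and the helper function are assumptions for
illustration, not our actual mapping.

```python
import json


def entities_for_user_query(user_id: str, start_date: str, end_date: str) -> dict:
    """Build a search body for entities that have a relationship child
    belonging to the given user, restricted to a date range.

    Field and type names are hypothetical placeholders.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # The date is stored on the entity (parent) itself.
                    {"range": {"date": {"gte": start_date, "lte": end_date}}},
                    # The user is stored on the child, so match via has_child.
                    {
                        "has_child": {
                            "type": "user_entity_relationship",
                            "query": {"term": {"user_id": user_id}},
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    body = entities_for_user_query("alice", "2014-01-01", "2014-01-31")
    print(json.dumps(body, indent=2))
```

Because the user lives on the child documents, a request like this cannot be
routed by user and would have to be sent to every shard.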

It would be great if anyone could advise me on best practices for indexing
as well as routing this kind of data.



(system) #2