Need Help Deciding the Number of Shards & Replicas


(Sang Dang) #1

Hi All,
I am working on a location search system.

My initial setup was:
2 servers (24 cores, 64 GB RAM).
5 primary shards & 1 replica.
My queries use term filters, numeric range filters, and a geo distance
filter, and sort by geo distance.
The term filters were placed in one bool filter.
An AndFilter held that bool filter, followed by the numeric range filters
and the geo distance filter.
But it was very slow.

I optimized it with the following changes:

  1. 2 primary shards & 1 replica. (I have 2 nodes, so with 1 replica each
    node holds a full copy of the data: 2 primary shards x (1 replica + 1)
    = 4 shard copies, 2 per node.)
  2. Wrapped the range filters in the bool filter & enabled caching.
  3. Used an AndFilter to hold the bool filter & the geo distance filter.
  4. Decreased the merge factor to 2, increased the refresh interval to 2s
    (default is 1s), and set the cache size to 4 GB.

The results have been good so far.
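
For reference, here is a rough sketch of the optimized filter layout from
steps 2 and 3, written as a Python dict in the style of the 1.x-era filtered
query DSL. The field names (category, rating, location), the values, and the
build_search_body helper are all hypothetical, since my real mapping is not
shown here, and the _cache flag on the bool filter is an assumption about
how caching was enabled.

```python
import json


def build_search_body(lat, lon, distance="5km"):
    """Sketch of: AndFilter[ bool(term + range), geo_distance ], sorted by distance.

    All field names and values are placeholders, not the real mapping.
    """
    # Term and range filters live together in one bool filter (step 2).
    bool_filter = {
        "bool": {
            "must": [
                {"term": {"category": "restaurant"}},        # hypothetical field
                {"range": {"rating": {"gte": 3, "lte": 5}}},  # hypothetical field
            ],
            "_cache": True,  # assumption: caching enabled on the bool filter
        }
    }
    # The geo distance filter stays outside the bool filter (step 3).
    geo_filter = {
        "geo_distance": {
            "distance": distance,
            "location": {"lat": lat, "lon": lon},  # hypothetical geo_point field
        }
    }
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                # The and filter holds the bool filter first, then the geo filter.
                "filter": {"and": [bool_filter, geo_filter]},
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    "location": {"lat": lat, "lon": lon},
                    "order": "asc",
                    "unit": "km",
                }
            }
        ],
    }


body = build_search_body(10.77, 106.70)
print(json.dumps(body, indent=2))
```

The idea behind this ordering is that the cheaper, cacheable bool filter
narrows the candidates before the geo distance filter runs on them.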

My data set is small, 1M-2M documents, and I expect to handle up to 5M.

I would like some advice on optimizing it further. Should I switch to only
1 primary shard? Should I increase the number of replicas? Should I use
routing?
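
To make the options concrete, here is a small sketch of the shard arithmetic
for my 2-node cluster. The shard_layout helper is only an illustration I
wrote for this post, and the settings dict is just the shape of an index
settings body, not something I have benchmarked.

```python
def shard_layout(primaries, replicas, nodes):
    """Return (total shard copies, average shard copies per node)."""
    total = primaries * (1 + replicas)  # each primary plus its replica copies
    return total, total / nodes


# Candidate layouts on 2 nodes: the original, the current, and a single primary.
for primaries, replicas in [(5, 1), (2, 1), (1, 1)]:
    total, per_node = shard_layout(primaries, replicas, nodes=2)
    print(f"{primaries} primaries, {replicas} replica(s): "
          f"{total} shard copies, {per_node:.1f} per node")

# Shape of the index settings for the current layout (values from this post).
# number_of_shards is fixed at index creation, so changing it means reindexing.
current_settings = {
    "settings": {
        "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1,
            "refresh_interval": "2s",
        }
    }
}
```

With 1 replica on 2 nodes, every layout above puts a full copy of the data
on each node, so the layouts differ only in how many shards a single search
has to fan out to.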

I appreciate your help ^^

Best Regards.


