Question on routing

Jae · November 9, 2012, 7:49pm

I am currently using Elasticsearch for realtime log analysis. Unfortunately, I
am having performance issues. To resolve them, I'd like to try custom
routing by timestamp, because our realtime log analysis focuses on
windows such as the last 15 minutes, last 1 hour, or last 4 hours. Is it
possible to shard based on a time range? If that is not supported yet, what
would be a good starting point for implementing custom routing logic?

My second question: as far as I can tell, Elasticsearch's routing logic
puts records with the same routing id in the same shard. If the
data has a skewed distribution on the routing field, does Elasticsearch
still keep the shards balanced across the cluster?

Thank you
Best, Jae

--

Loic_Bertron · November 9, 2012, 8:12pm

Hey Jae,

In this case you are better off using routing based on the date value. You can
route at index or query time on any value you want; just add the
routing parameter to your request like this: curl -XPUT
http://127.0.0.1:9200/index/type/id?routing=your_value. You should create
a custom timestamp based on the day, and maybe add the hour if you have a lot of
logs (routing=2012100912). All documents indexed with the same routing
value will be routed to the same shard. Use the same logic to query ES.
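
As a rough sketch of the idea above, here is how an hourly routing value and the matching request URLs could be built in Python. The helper names are made up for illustration, the year-month-day-hour layout of the value is only one reading of the example 2012100912, and the sketch assumes the REST query parameter is called routing and that a search accepts several comma-separated values.

```python
from datetime import datetime, timedelta
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:9200"  # host taken from the curl example


def hourly_routing(ts):
    """Return a routing value such as '2012100912' (year, month, day, hour)."""
    return ts.strftime("%Y%m%d%H")


def index_url(index, doc_type, doc_id, ts):
    """URL for indexing one document, routed by the hour of its timestamp."""
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    query = urlencode({"routing": hourly_routing(ts)})
    return f"{BASE_URL}/{path}?{query}"


def routing_values_for_range(start, end):
    """Hourly routing values covering every hour touched by [start, end]."""
    values = []
    current = start.replace(minute=0, second=0, microsecond=0)
    while current <= end:
        values.append(hourly_routing(current))
        current += timedelta(hours=1)
    return values


def search_url(index, start, end):
    """Search URL restricted to the routing values of the time window."""
    routing = ",".join(routing_values_for_range(start, end))
    return f"{BASE_URL}/{quote(index, safe='')}/_search?routing={routing}"
```

A "last 15 minutes" query would then call search_url with start = now - 15 minutes and end = now, which yields one or two routing values depending on whether the window crosses an hour boundary.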

Each routing value maps to a single shard, so a query that passes the
routing value is optimized: it only searches the shards those values map
to and not all the shards.
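
To picture which shard a routing value selects, and what a skewed routing field does to shard sizes, here is a small Python sketch. It assumes the commonly documented rule that the shard number is a hash of the routing value modulo the number of primary shards. The CRC32 hash is only a stand-in (Elasticsearch uses its own hash function, so real shard numbers will differ), and the function names and document counts are hypothetical.

```python
import zlib
from collections import Counter


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number with a stand-in hash.

    Only the 'hash modulo shard count' rule is illustrated here; the
    real hash function inside Elasticsearch is different.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shard_load(doc_counts, num_primary_shards):
    """Sum document counts per shard for a {routing_value: count} mapping."""
    load = Counter()
    for routing, count in doc_counts.items():
        load[pick_shard(routing, num_primary_shards)] += count
    return load


if __name__ == "__main__":
    # Hypothetical hourly volumes: one busy hour dominates the others.
    counts = {"2012100910": 1000, "2012100911": 1200, "2012100912": 90000}
    for shard, docs in sorted(shard_load(counts, 5).items()):
        print(f"shard {shard}: {docs} docs")
```

Under this rule, every document with the same routing value lands in the same shard, so a busy hour makes its shard correspondingly larger than the rest.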

--

Jae · November 9, 2012, 9:12pm

Thanks a lot!

--
