Search with _routing on indices without _routing


We have time-based data in an Elasticsearch cluster: one index per month, joined by an alias, so users only need to search on the alias.

In the past, when creating the index for each month, we didn't specify the _routing parameter. We have now run into search performance issues and plan to use a custom _routing parameter for new indices going forward.

If we do this, the alias will cover indices both with and without the _routing parameter. Can we pass the _routing parameter to the Search API in this case? Will the search return the correct results?

I don't think we can re-create the old indices with the _routing parameter because they're very large.

Any suggestions?

How many primary shards do your indices have? How many indices do you have? What is driving the interest in using routing?

There are 12 shards + 1 replica per index. We keep 5 years of data. Since there is one index per month, each alias corresponds to 12 x 5 = 60 indices.
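To put those numbers in terms of search fan-out, here is a rough back-of-the-envelope sketch. It assumes "12 shards" means 12 primary shards per index and that a search queries one copy (primary or replica) of every shard it targets; the variable names are only illustrative.

```python
MONTHS_PER_YEAR = 12
YEARS_RETAINED = 5
PRIMARY_SHARDS_PER_INDEX = 12  # assumption: "12 shards" means 12 primaries
REPLICAS_PER_PRIMARY = 1

# One index per month over the retention period.
indices_per_alias = MONTHS_PER_YEAR * YEARS_RETAINED

# Shards stored in the cluster for one alias.
primary_shards = indices_per_alias * PRIMARY_SHARDS_PER_INDEX
total_shard_copies = primary_shards * (1 + REPLICAS_PER_PRIMARY)

# A search without routing queries one copy of every shard behind the alias.
shards_queried_unrouted = primary_shards

# If every index were routed, a routed search would hit one shard per index.
shards_queried_routed = indices_per_alias

print(indices_per_alias, primary_shards, total_shard_copies)
print(shards_queried_unrouted, shards_queried_routed)
```

So a single alias search currently fans out to 720 shards, and routing could bring that down to 60 only if all 60 indices were indexed with routing.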

Our users' search requests always filter on one field, warehouse_id. I want to set _routing on this field. Am I right in assuming this setting can improve search performance?

Yes, it might.

What is the cardinality of the warehouse_id field?

The cardinality of warehouse_id is low: around 500 unique values in one index with 600K documents.

To get the full benefit from routing, I suspect you need to reindex the existing data as well; otherwise, queries that pass a routing value will probably return incorrect results from the old indices. You may, however, start indexing with routing without using it when querying. This means all queries will target all shards, but in indices created with routing only one of the shards will ever return results for a given warehouse_id, which over time may give a performance boost on its own.
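The reasoning above can be illustrated with a small in-memory simulation. This is only a sketch: it assumes a document lands on shard hash(routing value) % number of shards, with the document id used when no routing is given, and it uses CRC32 as a stand-in hash (Elasticsearch uses its own hash function). The document ids and warehouse ids are made up.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 12  # primary shards per index, as stated in the thread


def shard_for(routing_value, num_shards=NUM_SHARDS):
    # Stand-in hash for illustration only; not the hash Elasticsearch uses.
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def build_index(docs, use_routing):
    # Place each document on a shard, keyed by warehouse_id or by document id.
    shards = defaultdict(list)
    for doc in docs:
        key = doc["warehouse_id"] if use_routing else doc["id"]
        shards[shard_for(key)].append(doc)
    return shards


def search(shards, warehouse_id, routing=None):
    # With routing, only one shard is consulted; without it, all shards are.
    targets = [shard_for(routing)] if routing is not None else range(NUM_SHARDS)
    return [
        doc
        for shard in targets
        for doc in shards.get(shard, [])
        if doc["warehouse_id"] == warehouse_id
    ]


# Hypothetical data: 6000 documents spread evenly over 500 warehouses.
docs = [{"id": f"doc-{i}", "warehouse_id": f"wh-{i % 500}"} for i in range(6000)]
old_index = build_index(docs, use_routing=False)  # indexed without routing
new_index = build_index(docs, use_routing=True)   # indexed with routing

full_old = search(old_index, "wh-7")                    # all shards, complete
routed_old = search(old_index, "wh-7", routing="wh-7")  # one shard, likely incomplete
full_new = search(new_index, "wh-7")                    # all shards, complete
routed_new = search(new_index, "wh-7", routing="wh-7")  # one shard, complete

print(len(full_old), len(routed_old), len(full_new), len(routed_new))
```

In the old index the matching documents are scattered by document id, so a routed search sees only the ones that happen to sit on the targeted shard. In the new index every document for a warehouse sits on one shard, so routed and unrouted searches agree.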

If I use the Search API without the _routing parameter against an index created with the _routing parameter, will I still see a performance boost? Why? My understanding is that this search will go to all shards, not just one.

I am speculating that you may see some benefit, but to be sure I would recommend you test. Reindex one of your existing indices into a new index with the same number of shards and use routing. Then run the same queries against both without routing and see if there is any difference.
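A minimal sketch of how such a comparison could be set up. The index names (orders-2016-01, orders-2016-01-routed) and the send callable are hypothetical placeholders for your own indices and HTTP client; the routing query-string parameter and the 1.x-style filtered query are the only Elasticsearch-specific parts, and the latter would need adapting on newer versions.

```python
import json
import statistics
import time
from urllib.parse import urlencode


def build_search_request(index, warehouse_id, routing=None):
    # Returns (path, JSON body) for a term filter on warehouse_id.
    path = f"/{index}/_search"
    if routing is not None:
        path += "?" + urlencode({"routing": routing})
    body = {
        "query": {
            "filtered": {  # 1.x-style query; adapt for newer versions
                "filter": {"term": {"warehouse_id": warehouse_id}}
            }
        }
    }
    return path, json.dumps(body)


def median_latency(send, request, runs=20):
    # `send` is a caller-supplied function (path, body) -> response.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        send(*request)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Same query against the original index and its routed copy, both without routing.
unrouted_old = build_search_request("orders-2016-01", "wh-7")
unrouted_new = build_search_request("orders-2016-01-routed", "wh-7")
# Optional third case: routed search against the routed copy.
routed_new = build_search_request("orders-2016-01-routed", "wh-7", routing="wh-7")

print(unrouted_old[0])
print(routed_new[0])
```

Passing each request to median_latency with a real send function, after a few warm-up runs, gives comparable numbers for the two indices.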

Reindexing my old indices is another pain. I'm using Elasticsearch 1.7, so there is no Reindex API. Even if it were available, I can't imagine how long it would take to copy 600K documents.

Then I would warmly recommend upgrading as well... :)
