Microservices: Decouple sorting from filtering/aggregations

Context: ecommerce category page

We are using Elasticsearch for our product index.
On a category page:

  • the users can filter products,
  • we show the number of products in each category,
  • we sort the products using Elasticsearch

We are moving to (micro)services. The teams should be as autonomous and independent as possible.
Idea: one team owns the category page, but another team optimizes the sort order.

Challenge:
If both teams work in a single Elasticsearch index and the same query has to filter/aggregate/sort, there is strong coupling between the two teams.

Question:
How might we decouple the sorting from the rest of the category page?
How are other companies doing this?
Does Elasticsearch have any capabilities that could help?

Thoughts:
One approach might be to build two fully independent systems that don't use the same Elasticsearch but integrate via APIs. So one system might prepare the category page including filtering and aggregations, and then ask a second system for products in the right order.

But that does not appear easy.
The second system would need the same data and it would need to respect the selected category and applied filters – it might duplicate a lot of what the first system does. And when you introduce a new filter, the second system might have to change, too?

Or the first system would have to pass all (filtered) products to the second system to avoid having to duplicate the filtering. But then the first system might have to send thousands of product IDs to be sorted for every page view – that does not scale.

Have you considered sharing the filter definitions among the microservices and then running one aggregation query and one product query? That said, because the two queries run at different times, they can yield inconsistent counts and results.
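To make the shared-filter idea concrete, here is a minimal sketch in Python that only builds the two request bodies from one shared filter definition; it does not talk to a cluster. The field names (`category`, `brand`, `color`, `sort_score`) and the helper names are assumptions for illustration, not anything from your index. The `bool`/`filter`, `terms` aggregation and `sort` clauses are standard query DSL.

```python
from __future__ import annotations

import json
from typing import Any


def build_filter_clauses(category: str, selected: dict[str, list[str]]) -> list[dict[str, Any]]:
    """Shared piece: turn the page state into filter clauses (hypothetical field names)."""
    clauses: list[dict[str, Any]] = [{"term": {"category": category}}]
    for field, values in sorted(selected.items()):
        clauses.append({"terms": {field: values}})
    return clauses


def aggregation_request(filters: list[dict[str, Any]], facet_fields: list[str]) -> dict[str, Any]:
    """Category team: counts only, no hits returned (size 0)."""
    return {
        "size": 0,
        "query": {"bool": {"filter": filters}},
        "aggs": {f"by_{field}": {"terms": {"field": field}} for field in facet_fields},
    }


def product_request(filters: list[dict[str, Any]], sort_field: str, page_size: int = 24) -> dict[str, Any]:
    """Sorting team: same filters, but owns the sort clause."""
    return {
        "size": page_size,
        "query": {"bool": {"filter": filters}},
        "sort": [{sort_field: {"order": "desc"}}],
    }


# Example page state: category "shoes" with two selected filters.
filters = build_filter_clauses("shoes", {"brand": ["acme"], "color": ["red", "blue"]})
agg_body = aggregation_request(filters, ["brand", "color"])
product_body = product_request(filters, "sort_score")
print(json.dumps(agg_body, indent=2))
print(json.dumps(product_body, indent=2))
```

Both bodies carry an identical `query` part, so the only thing the two teams have to agree on is the filter builder. The consistency caveat still applies, because these are two separate requests.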

In general, this seems to be less of a data retrieval issue and more of an organizational issue in the sense of Conway's law. You will always find a solution to this (e.g. enabling caching, writing the same queries), but it will never match the simplicity (and probably the performance, since it runs only once) of your initial single query.

We have not considered any options thoroughly yet – we are only starting to extract the services. I am trying to get an overview of the options and trade-offs – and to uncover some unknowns (from lacking Elasticsearch expertise).

Our organizational issue is that we want to work in that area with many teams. So we need to split the system very granularly. Multiple things that currently work from one system with one Elasticsearch will have to be done by different teams / systems in the future.

Our first challenge is sorting, but in the future we may also want to have a team dedicated to search – then this question will come up again. You would have one system that can do German language search and display the results – but you still want to be able to filter on the page; or navigate to sub-categories.

It may be an option to "share" filters and categories in a way that lets each system use them. But that is not easy, and it does not decouple the systems cleanly. The category navigation and the filters are a big part of the category system, so sharing them with other systems also introduces a coupling/dependency.

For sorting, we considered an option where the sorting system might only create scores for products and then the category system would do the actual sorting using the scores. But that would not be an option for search – you can't precompute the scores for any possible search term.
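A minimal sketch of that score handoff, in Python and purely in memory: the sorting system publishes one score per product ID, and the category system orders its already-filtered products by those scores. The function name, the default score for products without a score, and the tie-break on product ID are assumptions for illustration. In a real setup the score would more likely be indexed as a numeric field on the product document so that Elasticsearch does the sorting.

```python
from __future__ import annotations


def sort_by_external_scores(
    product_ids: list[str],
    scores: dict[str, float],
    default_score: float = 0.0,
) -> list[str]:
    """Order product IDs by descending score; unknown products get default_score.

    Ties are broken by product ID so the order is stable across requests.
    """
    return sorted(
        product_ids,
        key=lambda pid: (-scores.get(pid, default_score), pid),
    )


# Scores as the sorting system might publish them (hypothetical values).
published_scores = {"p1": 0.2, "p3": 0.9}

# Products that survived filtering in the category system.
filtered_products = ["p1", "p2", "p3"]

print(sort_by_external_scores(filtered_products, published_scores))
```

The contract between the two systems shrinks to a product ID and a number, which is what makes this attractive for sorting and unworkable for free-text search, where the score depends on the query.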