Index-Per-User Scale

Hi Everyone,

Thanks in advance for reading and any answers! We're considering using ES with an index-per-user design, and have some specific questions about how it scales. I've read up on the subject, and a common suggestion for this scenario is a shared index with a user-id filter; however, I think we have a strong reason not to use that design (reason below; feedback welcome).

Why we prefer index-per-user over shared index:
We often need to fetch all of the documents for a user. With a shared index we can filter by user-id, but that still means making N disk reads, since the user's documents are spread across a large index (a 10k-document user would take over 1 second of IO time). With index-per-user, the entire index is only 2-3MB for 10k documents (our documents are small, primarily non-analyzed fields), so we can load the whole index in a few disk reads (the maximum AWS/EBS I/O size is 256KB). With index-per-user we can load about 500 users per second at the same disk speed, which is ~833 times faster than the shared-index design (actual performance varies, but it's much, much faster with index-per-user). Caching at the page-cache level also improves greatly, since many of a user's documents fit into a single page.
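For what it's worth, the rough arithmetic behind that ~833x (assuming each shared-index document costs one random read): shared index ≈ 10,000 reads for a 10k-document user, while index-per-user ≈ 3MB / 256KB ≈ 12 reads, and 10,000 / 12 ≈ 833. At the ~10k random reads/second implied by the 1-second figure above, 12 reads per user gives roughly 800 users/second, so ~500/second is a conservative round-down.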

Technical details:

  • over 1M users. No requirement to search across all of them.
  • only a small number of users are in use at any given time (~2,000). Most are cold and could be closed using the ES close API (we're happy to manage the open/close calls ourselves; see the snippet after this list)
  • 500-500k documents per user (99% under 25k). Documents are quite small and follow a consistent schema/mapping; as a result, 99% of indexes would be under 5MB.
  • we also want to do some aggregations, which will be fast on the smaller indexes but slow down on the larger shared indexes (still benchmarking)
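For reference, the open/close management we have in mind is just the plain close/open index API (a sketch; the host and index name are made up):

    # Close a cold user's index to release its resources...
    curl -XPOST 'localhost:9200/user-42/_close'
    # ...and reopen it when that user becomes active again.
    curl -XPOST 'localhost:9200/user-42/_open'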

Questions:

  • How much memory does each index take in the cluster state? Does it grow linearly?
  • If an index is closed, how much memory does it take in the cluster state?
  • Is it crazy to attempt a cluster with >1M indexes (mostly closed) and ~5k open indexes? If so, please explain why.
  • Any gotchas we should be aware of if considering launching "index-per-user" at this scale?

We'll also be benchmarking the above, but I thought I'd ask and get some feedback in case we're missing something.

Thanks in advance,
Steve

This might be useful to you:

It's obviously not done, but it could help fix these locality issues.

I don't have a good idea of how much memory each index takes, but it certainly does grow linearly.

You'll probably be the only person doing it, so you're likely to hit new and fun bugs no one has thought of. There are folks who have lots of always-open indexes - I have two thousand or so - and I can feel cluster-state actions being a bit sluggish. I'm still on an old version of Elasticsearch at the moment, and I know some of that has been fixed, but I still expect the scale of what you're describing to be "fun".

I think it's worth writing a test application that pretends to do what you're going to do - it could be as simple as a bash script that creates your million indexes, jams 20 docs into each, and closes them (see the sketch below). Measuring memory consumption in Java is hard, but you should be able to simulate your task and test it that way. I'd advise against doing what you plan to do without first proving it out like this.
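Something like this, say (a sketch, not tested at this scale; it assumes Elasticsearch on localhost:9200 and made-up index/type/field names):

    #!/usr/bin/env bash
    # Create a million tiny indexes, jam 20 docs into each, then close them.
    ES=localhost:9200
    for i in $(seq 1 1000000); do
      idx="user-$i"
      # one shard, no replicas, to keep the per-index overhead minimal
      curl -s -XPUT "$ES/$idx" \
        -d '{"settings": {"number_of_shards": 1, "number_of_replicas": 0}}' > /dev/null
      for d in $(seq 1 20); do
        curl -s -XPOST "$ES/$idx/doc" -d "{\"field\": \"value $d\"}" > /dev/null
      done
      curl -s -XPOST "$ES/$idx/_close" > /dev/null
    done

While it runs you can poll _cluster/state and the nodes stats API to watch how the cluster state and heap grow.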

Or go comment on the issue I linked above and maybe pitch in there somehow.

Thanks, Nik! The ordering/sorting would certainly be perfect. I wish it were available today. If we decide to use ES, we'll try to contribute.

We'll benchmark and I'll post the results here. However, I'm not too keen on wandering into unexplored territory.

If anyone else has experience with this scale, please let me know.

Thanks,
Steve

Agreed with what Nik said. Index-per-user is not advisable when you're thinking of a customer base of more than, say, 10K. While you get isolation with an index per user, there is a cost attached to it: the cluster state grows, more shards, more reallocation/balancing, more load on the master, more index stats captured by Marvel... and the list continues. At the same time, one index for all users is also not recommended. You have to come up with a reasonable shared-index model.

Another thing you need to consider is how consistent the mappings are across your users' documents. If there is little structure and a lot of new mappings need to be added as documents are indexed, each added mapping or index needs to update the cluster state, which then needs to be distributed across the cluster. This can easily become a bottleneck once the cluster state grows large. It may improve once the ability to send cluster-state deltas, instead of the full cluster state for each change, gets released.
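One way to take unplanned mapping updates (and the cluster-state churn they cause) out of the picture is to lock the mapping down up front. A sketch, with made-up index, type, and field names:

    # "dynamic": "strict" rejects documents that introduce unmapped fields,
    # so indexing never triggers a surprise cluster-state update.
    curl -XPUT 'localhost:9200/shared/_mapping/doc' -d '{
      "doc": {
        "dynamic": "strict",
        "properties": {
          "user_id": {"type": "string", "index": "not_analyzed"},
          "field":   {"type": "string", "index": "not_analyzed"}
        }
      }
    }'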

If you can control the mappings used across the user base, or can impose a naming convention on field names and types, you should be able to achieve some of the same efficiency by routing by user, ensuring that all documents belonging to a specific user land in a single, although shared, shard.
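For example (again a sketch with made-up names), supplying the same routing value at index time and at search time keeps one user's documents on one shard and queries only that shard:

    # Index with the user as the routing key...
    curl -XPUT 'localhost:9200/shared/doc/1?routing=user42' \
      -d '{"user_id": "user42", "field": "value"}'
    # ...and search with the same routing value so only that shard is hit.
    curl -XGET 'localhost:9200/shared/_search?routing=user42' \
      -d '{"query": {"term": {"user_id": "user42"}}}'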

Thanks everyone.

Given that our use case really needs disk/page isolation, we'll probably shy away from ES. It's too bad - it looks awesome.

Thanks for the help.
Steve