Need-Help-On-elasticsearch in complex scenario

Hi team, I need your help with a very complex scenario.

We have a user management system, which has a user index, which stores various attributes of user, and a user action log index, which records what the user has done at what time, the user action log index has a userId field.

Our requirement is to be able to filter out specific users based on multiple attributes and aggregate their actions over a time horizon: for example, how many emails and how many phone calls they made in last month.

Considering our time horizon can be one day, one month, or one year, the index of our operation records is cut by month like [user-log-202201, user-log-202202, ...].

The problem is that the number of users we filter out based on specified attributes can be very, very large (sometimes even hundreds of thousands), and it would be very slow to query the records of these userId using term Query in the user action log index.

We have a solution: we merge these two indices, so we only need to query log records by multi user attributes, but this poses another problem: we could update, delete or add a attribute for user at any time. after merge, updates a single attribute for one user, is equivalent to update all log records of this user, this also seems to have an impact on performance.

Is there a better solution?

No, not that I am aware of. The second solution you described is what is generally recommended if the first does not scale (or if you do not want to partition the second query phase). Performing a lot of extensive updates can have an imapct on performance but this approach is often the best tradeoff assuming updates are reasonably infrequent.

If you are going with the second approach, make sure you do not forcemerge older indices down to a single segment as this can appect merging in order to remove updated or deleted documents.

OK, Thank you for your answer.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.