Multiple Indices - Index Documents with same ID - Treat as single document when aggregating

Would the following be a possibility?

Index A - Document A

{
  "_id": 123456789,
  "name": "XXXX",
  "address": "XXXX"
}

Index B - Document A

{
  "_id": 123456789,
  "biography": "XXXXXyyyXXXXyyyyXXXXyyyyXXXX"
}

I would then run a terms aggregation on "name" across both indices, restricted to documents whose "biography" contains a certain text.

Basically, is it feasible in ES to separate out large text fields into separate documents in multiple indices but treat them as the same document if they have the same ID when doing searches or running aggregations?

No, that is not possible. What is the problem you are trying to solve?
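To illustrate why, here is a minimal Python sketch that models the two indices as plain lists of dicts. It does not talk to Elasticsearch, and the index names and sample values are made up. The point it tries to show is that a search across several indices evaluates each document on its own, so the document that matches on "biography" has no "name" field to aggregate on, even though another index holds a document with the same ID.

```python
from collections import Counter

# Hypothetical sample data: two indices holding documents that share an _id.
index_a = [{"_index": "index_a", "_id": "123456789", "name": "Alice", "address": "1 Main St"}]
index_b = [{"_index": "index_b", "_id": "123456789", "biography": "wrote about search engines"}]


def search(indices, field, text):
    """Return every document, from any index, whose `field` contains `text`.

    Each document is evaluated on its own; nothing joins documents by _id.
    """
    return [doc for index in indices for doc in index
            if text in doc.get(field, "")]


def terms_aggregation(hits, field):
    """Count values of `field` across the hits, skipping hits without the field."""
    return Counter(doc[field] for doc in hits if field in doc)


hits = search([index_a, index_b], "biography", "search engines")
buckets = terms_aggregation(hits, "name")

print([hit["_index"] for hit in hits])  # only the index_b document matches
print(dict(buckets))                    # empty: the matching hit has no "name"
```

Getting a "name" bucket here would require joining the two documents by ID before aggregating, which is the step a multi-index search does not perform.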

Thanks @Christian_Dahlqvist for getting back.

I did try it and it was not possible. I was wondering whether this could be considered for a future version.

The problem I was looking to solve with this was the following scenario:

  1. I have a document mapping in which most fields are simple fields with little text content.

  2. Only a couple of fields are large text fields. They are used infrequently, for full-text searches against those fields and for aggregations on other fields filtered by that full-text search.

  3. If I could store them separately and use them only when required, the main document would be smaller and cheaper to maintain, since the large field is always the same for that particular document. It would also help with updates, because I would only need to touch the smaller document.

I am not sure whether this is feasible, though, or what negative effects it would have elsewhere.

Thanks!

If you have documents with a large component that does not change frequently and some fields that are updated more regularly, I have seen an arrangement similar to the one you describe, but using a parent-child relationship within a single index. This lets you update the documents independently and still aggregate over them, although possibly with a different syntax.
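As a rough sketch of what that could look like, the Python snippet below only builds the request bodies as dicts; it does not send anything to a cluster. The join field name `doc_relation` and the relation names `profile` and `bio` are placeholders I made up, and mapping "name" as `keyword` is an assumption so that it can be used in a terms aggregation.

```python
import json

# Hypothetical names throughout: the "doc_relation" join field and the
# "profile"/"bio" relation names are placeholders.
PARENT_ID = "123456789"

# Index creation body: one join field linking a small "profile" parent
# to a large "bio" child.
index_body = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},  # keyword so it works in a terms aggregation
            "address": {"type": "text"},
            "biography": {"type": "text"},
            "doc_relation": {"type": "join", "relations": {"profile": "bio"}},
        }
    }
}

# Parent document: the small, frequently updated fields.
parent_doc = {"name": "XXXX", "address": "XXXX", "doc_relation": {"name": "profile"}}

# Child document: the large text. It needs its own _id and must be indexed
# with routing set to the parent's _id so both end up on the same shard.
child_doc = {
    "biography": "XXXXXyyyXXXXyyyyXXXXyyyyXXXX",
    "doc_relation": {"name": "bio", "parent": PARENT_ID},
}
child_routing = PARENT_ID

# Search body: terms aggregation on "name" over parents whose child matches the text.
search_body = {
    "size": 0,
    "query": {
        "has_child": {
            "type": "bio",
            "query": {"match": {"biography": "certain text"}},
        }
    },
    "aggs": {"names": {"terms": {"field": "name"}}},
}

print(json.dumps(search_body, indent=2))
```

The "different syntax" is mostly the `has_child` wrapper: the full-text condition moves inside it, while the aggregation on "name" stays at the top level because it runs over the matching parent documents.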

Thanks @Christian_Dahlqvist.

I did look at parent-child relationships within a single index, but the documentation states "The only case where the join field makes sense is if your data contains a one-to-many relationship where one entity significantly outnumbers the other entity" and also notes that it has a significant performance impact on search queries.

In my case the large text field is one-to-one with its document: each text is unique to that document. So I was wondering whether the above scenario could be considered for the future, or whether there are outright limitations that rule it out. I would also like to keep changes to my queries minimal, if that is feasible.

Thanks!

Any other thoughts @Christian_Dahlqvist? Could this be considered as a use-case?

@Christian_Dahlqvist or anybody else? I'd love to hear back on the feasibility of the request.

I would say it is feasible. Using parent-child to avoid repeatedly reindexing a large document could work, even though it is not the typical use case. Whether it is worth it is something you will need to test.
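For the update side, a minimal sketch of the kind of request I mean is below. It only builds the request in Python and does not send it. The index name `people` is hypothetical, and the `/{index}/_update/{id}` path assumes a recent Elasticsearch version (7.x or later).

```python
import json


def parent_update_request(index, parent_id, changes):
    """Build a partial-update request for the parent document only.

    Returns (method, path, body). The child document holding the large
    text is a separate document with its own _id, so it is not part of
    this request.
    """
    return "POST", f"/{index}/_update/{parent_id}", {"doc": changes}


method, path, body = parent_update_request("people", "123456789", {"address": "YYYY"})
print(method, path, json.dumps(body))
```

Only the small parent document is rewritten by such a request, which is the saving you would want to measure in your tests.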

Thanks, @Christian_Dahlqvist.

OK, but is there no way to aggregate across two indices using the same document ID? I am not asking for it in the current version, but as a potential future consideration, given the benefits it would have.

Or do the technical challenges and drawbacks outweigh the potential benefits, or is it simply not possible? Please let me know your thoughts.
