Multiple Indices - Index Documents with same ID - Treat as single document when aggregating

gnukev · August 26, 2019, 11:39am

Would the following be a possibility?

Index A - Document A

{
  "_id": "123456789",
  "name": "XXXX",
  "address": "XXXX"
}

Index B - Document A

{
  "_id": "123456789",
  "biography": "XXXXXyyyXXXXyyyyXXXXyyyyXXXX"
}

And then I run a terms aggregation against both indices on "name", filtered to documents where "biography" contains a certain text.

Basically, is it feasible in ES to separate out large text fields into separate documents in multiple indices but treat them as the same document if they have the same ID when doing searches or running aggregations?
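To make the requested behaviour concrete, here is a minimal sketch in plain Python, not Elasticsearch. It only illustrates the semantics being asked for: documents from two indices that share an _id are merged into one logical document, and a terms-style count on "name" is then run over those whose "biography" contains some text. The sample values, the in-memory dicts standing in for the indices, and the helper names are all hypothetical.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for "Index A" and "Index B", keyed by _id.
index_a = {
    "123456789": {"name": "alice", "address": "1 Main St"},
    "987654321": {"name": "bob", "address": "2 Side St"},
    "555555555": {"name": "alice", "address": "3 High St"},
}
index_b = {
    "123456789": {"biography": "wrote about search engines"},
    "987654321": {"biography": "wrote about gardening"},
    "555555555": {"biography": "search relevance engineer"},
}


def merge_by_id(*indices):
    """Combine documents that share an _id into one logical document."""
    merged = {}
    for index in indices:
        for doc_id, fields in index.items():
            merged.setdefault(doc_id, {}).update(fields)
    return merged


def terms_on_name(docs, biography_text):
    """Count "name" values over docs whose biography contains the given text."""
    return Counter(
        doc["name"]
        for doc in docs.values()
        if "name" in doc and biography_text in doc.get("biography", "")
    )


merged_docs = merge_by_id(index_a, index_b)
name_counts = terms_on_name(merged_docs, "search")
```

The merge step is the part Elasticsearch would have to do across indices for this to work as described.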

Christian_Dahlqvist · August 26, 2019, 11:56am

No, that is not possible. What is the problem you are trying to solve?

gnukev · August 26, 2019, 12:56pm

Thanks @Christian_Dahlqvist for getting back to me.

I did try it and it was not possible. I was wondering whether this could be a potential future consideration.

The problem I was looking to solve with this was the following scenario:

I have a document mapping where most fields are basic fields with little text content.
Only a couple of fields are large text fields. They are used infrequently, for full-text searches against those fields and for aggregations on other fields filtered by the full-text search.
If I could store the large fields separately and use them only when required, the main document would be smaller. It would also save on maintenance, because the large field never changes for a given document, and it would simplify updates, because I would only need to work with the smaller document.

I am not sure whether this is feasible, though, or what negative effects it would have elsewhere.

Thanks!

Christian_Dahlqvist · August 26, 2019, 1:03pm

If you have documents with a large component that does not change frequently and some fields that are updated more regularly, I have seen an arrangement similar to the one you described, but using a parent-child relationship within a single index. This allows you to update the documents independently and still aggregate over them, although possibly with a different syntax.
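As a rough sketch of what that parent-child arrangement could look like, the Python below only builds the request bodies as dicts and sends nothing to a cluster. It assumes the typeless mapping syntax of Elasticsearch 7.x and uses the join field type and the has_child query. The index name, the relation names ("profile", "biography_doc"), the join field name ("doc_relation") and the child ID are made up for illustration.

```python
import json

INDEX = "people"          # hypothetical index name
PARENT_ID = "123456789"   # the shared ID from the example above

# Mapping with a join field: "profile" is the parent, "biography_doc" the child.
mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},   # keyword so it can be used in a terms agg
            "address": {"type": "text"},
            "biography": {"type": "text"},
            "doc_relation": {
                "type": "join",
                "relations": {"profile": "biography_doc"},
            },
        }
    }
}

# Small, frequently updated parent document.
parent_doc = {"name": "XXXX", "address": "XXXX", "doc_relation": "profile"}

# Large, rarely changing child document. It must be indexed with
# routing=PARENT_ID so that it lands on the same shard as its parent.
child_doc = {
    "biography": "XXXXXyyyXXXXyyyyXXXXyyyyXXXX",
    "doc_relation": {"name": "biography_doc", "parent": PARENT_ID},
}

# Terms aggregation on the parents' "name", restricted to parents that have
# a child whose biography matches the search text.
search_body = {
    "size": 0,
    "query": {
        "has_child": {
            "type": "biography_doc",
            "query": {"match": {"biography": "certain text"}},
        }
    },
    "aggs": {"names": {"terms": {"field": "name"}}},
}

requests = [
    ("PUT", f"/{INDEX}", mapping),
    ("PUT", f"/{INDEX}/_doc/{PARENT_ID}", parent_doc),
    ("PUT", f"/{INDEX}/_doc/{PARENT_ID}-bio?routing={PARENT_ID}", child_doc),
    ("GET", f"/{INDEX}/_search", search_body),
]

for method, path, body in requests:
    print(method, path)
    print(json.dumps(body, indent=2))
```

With this layout, an update to the parent document does not reindex the large biography text. The aggregation still runs on "name", with the has_child query taking the place of a plain match on "biography".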

gnukev · August 26, 2019, 1:17pm

Thanks @Christian_Dahlqvist.

I did look at parent-child relationships within a single index, but the documentation says "The only case where the join field makes sense is if your data contains a one-to-many relationship where one entity significantly outnumbers the other entity" and also mentions that the join field has a significant performance impact on search queries.

In my case the relationship is one-to-one: each document has its own unique large text field. So I was wondering whether the scenario above could be considered for the future, or whether there are any outright limitations that rule it out. I would also like to keep changes to my queries minimal, if that is feasible.

Thanks!

gnukev · August 27, 2019, 9:33am

Any other thoughts, @Christian_Dahlqvist? Could this be considered as a use case?

gnukev · September 17, 2019, 9:56am

@Christian_Dahlqvist or anybody else? I'd love to hear back on the feasibility of the request.

Christian_Dahlqvist · September 17, 2019, 11:48am

I would say it is feasible. Using parent-child to avoid repeatedly reindexing a large document could work, even though it is not the typical use case. Whether it is worth it is something I think you will need to test.

gnukev · September 18, 2019, 1:08pm

Thanks, @Christian_Dahlqvist.

OK, but is there no way to aggregate across two indices using the same document ID? I am not asking for this in the current version, but as a potential future consideration, given the benefits it would have.

Or do the technical challenges and drawbacks outweigh the potential benefits, or is it simply not possible? Please let me know your thoughts.

