I might be overcomplicating this (most likely!).
I have a huge data set.
Within that data set there are 'organizations'. These are nested within the main record as an array, since each record can hold several of them as a history (think of a resume).
What I'd like to do is leverage these organizations to create an entirely new search interface that lets users search specifically for organizations, based on the nested fields within the organization object, such as size, year, and industry.
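To make that concrete, here is a rough sketch of the kind of query I have in mind, built as a Python dict. The field names (`organizations`, `industry`, `size`, `year`) and the helper `build_org_query` are placeholders rather than my real mapping, and it assumes the array is mapped with the `nested` field type.

```python
import json


def build_org_query(industry=None, size=None, year=None):
    """Build a nested query body (hypothetical field names).

    Matches records that contain at least one organization entry
    satisfying every filter that was supplied.
    """
    filters = []
    if industry is not None:
        filters.append({"term": {"organizations.industry": industry}})
    if size is not None:
        filters.append({"term": {"organizations.size": size}})
    if year is not None:
        filters.append({"term": {"organizations.year": year}})

    return {
        "query": {
            "nested": {
                # Assumes "organizations" is mapped as type "nested".
                "path": "organizations",
                "query": {"bool": {"filter": filters}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_org_query(industry="software", year=2019), indent=2))
```

As written, this returns whole parent records, not individual organizations, which is where the duplicate problem comes from.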
Some challenges I'm facing:
- I need to avoid duplicate organizations being returned in search results.
- The organizations are nested within an array, which is itself nested within the full record, so any aggregation would need to take that nesting into account to avoid duplicates.
- Fortunately, there are currently only 11 fields within the organization object, so it's not very large, but the overall document has 200+.
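For the duplicates, one option I've been considering is a nested aggregation with a terms bucket per organization, roughly like the sketch below. Again, `organizations.name` (assumed to be a `keyword` field that identifies an organization), the aggregation names, and the helper `build_unique_orgs_agg` are placeholders, not my real mapping.

```python
import json


def build_unique_orgs_agg(org_filter, max_buckets=50):
    """Build a request body that buckets matching organizations by name.

    org_filter is any query clause over the nested organization fields.
    All field and aggregation names here are hypothetical.
    """
    return {
        "size": 0,  # skip parent hits; only the aggregation is wanted
        "aggs": {
            "orgs": {
                # Step into the nested organization objects.
                "nested": {"path": "organizations"},
                "aggs": {
                    "matching": {
                        # Keep only the organization entries that match.
                        "filter": org_filter,
                        "aggs": {
                            "unique_orgs": {
                                # One bucket per distinct organization name.
                                "terms": {
                                    "field": "organizations.name",
                                    "size": max_buckets,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_unique_orgs_agg({"term": {"organizations.industry": "software"}})
    print(json.dumps(body, indent=2))
```

Since the terms aggregation returns one bucket per distinct value, each organization would show up once no matter how many records contain it. It only returns the top `max_buckets` buckets though, so paging through every organization would probably need something like a composite aggregation instead.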
I know it would be simpler to just extract the organizations, do some work with this data and then upload to a second separate instance of ES, but for now I'm looking to keep costs down and use the same dataset, if possible.
If this is a crazy way to do it - let me know and I'll bite the bullet and look into a second instance.