Hi,
I have two data sources: Global (14M records) and customer-specific (5M records).
Data: products with the fields name, description, manufacturer part number (MPN), and vendor SKU.
Input: a search query.
Output:
Results grouped by MPN across all data sources and ranked by relevancy (the best-scoring doc in each group decides the group's relevancy), with pagination over the grouped results and no group repeated across pages.
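A minimal sketch of the grouping described above, in plain Python over hits that have already been scored. It only illustrates the expected output shape; it is not how a search engine would do it at 14M+ records. The function name `group_hits_by_mpn` and the `score` key are hypothetical, chosen for illustration.

```python
def group_hits_by_mpn(hits, page, page_size):
    """Group scored hits by MPN, rank groups by their best doc, return one page."""
    # Bucket every hit under its MPN, regardless of data source.
    groups = {}
    for hit in hits:
        groups.setdefault(hit["mpn"], []).append(hit)

    # A group's relevancy is the score of its best-matching doc.
    ranked = sorted(
        groups.items(),
        key=lambda item: max(h["score"] for h in item[1]),
        reverse=True,
    )

    # Paginate over groups, not docs, so an MPN appears on only one page.
    start = page * page_size
    return [
        {
            "mpn": mpn,
            "score": max(h["score"] for h in docs),
            "docs": sorted(docs, key=lambda h: h["score"], reverse=True),
        }
        for mpn, docs in ranked[start:start + page_size]
    ]
```

Because pages are cut from the ranked list of groups, the same MPN cannot show up on two pages as long as the underlying hits and scores do not change between requests.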
Can I use Collapse on MPN with my search query on a unified data model (differentiating docs by source_type)? I need fast retrieval over nearly 22M records.
Is there any other way to do it?
I can't pre-group the global data with the customer-specific data on MPN. I need that separation because there can be n customers and a single doc can be very large.
Is there any way to avoid grouping over the global data source docs, since that data set is huge? Perhaps some other approach with queries?
Unified data models -
Product from Global data source (no company_id)
{"mpn": "DELL-G7GV0","company_id": null,"source_type": "icecat","source_id": "80076143","product_name": "DELL G7GV0","description": "DELL G7GV0. Brand compatibility: Dell","manufacturer": "DELL","vendor_sku":"244545"}
Customer specific data
{"mpn": "DELL-G7GV0","company_id": "f1eb99a720ef4f05bfa6beae16aa235c","source_type": "custom_distributor","source_id": "34565434","product_name": "DELL G7GV0","description": "DELL G7GV0. Brand compatibility: Dell","manufacturer": "DELL","vendor_sku":"2345654"}  
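For reference, a hedged sketch of what a collapse request over this unified model might look like, built as a Python dict so it can be passed to any Elasticsearch client as the search body. It assumes `mpn` and `company_id` are mapped as `keyword` fields (collapse needs a single-valued keyword or numeric field with doc values); the helper name `build_collapsed_search`, the searched fields, and the `by_source` inner-hits name are illustrative choices, not something confirmed to fit this index.

```python
def build_collapsed_search(query_text, company_id, page, page_size=20):
    """Build a search body that returns one top hit per MPN."""
    return {
        # from/size paginate over collapsed groups, not raw docs.
        "from": page * page_size,
        "size": page_size,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query_text,
                        # Assumed text fields; adjust to the real mapping.
                        "fields": ["product_name", "description"],
                    }
                },
                # Keep global docs (no company_id) plus this customer's docs.
                "filter": {
                    "bool": {
                        "should": [
                            {"bool": {"must_not": {"exists": {"field": "company_id"}}}},
                            {"term": {"company_id": company_id}},
                        ],
                        "minimum_should_match": 1,
                    }
                },
            }
        },
        "collapse": {
            "field": "mpn",
            # Optionally return the other docs of each group, best first.
            "inner_hits": {"name": "by_source", "size": 5},
        },
    }
```

With the default sort on `_score`, each collapsed hit is the best-scoring doc for its MPN, which matches the "most matching doc decides the group" requirement. Note that the total hit count in the response still counts un-collapsed documents, so the number of groups would need a separate count (for example a cardinality aggregation on `mpn`).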
 so you have already limited your chance of success....