Search to treat multiple documents as one

eracac · July 25, 2016, 5:09pm

Hi,
Some of the documents we need to index are very large (possibly up to 2 GB). Is there a recommended way to index documents like these so that they are still easily searchable as a unit? So far, I have split each large document into smaller documents, each with an id field that identifies the original document it came from. However, this means that for queries that combine terms with "AND", I have to split the query into one query per term and then group the results by the id field to find out which original documents match as a whole. Is there a better way to do this?
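
To make the workaround concrete, here is a minimal sketch of the split-and-group idea in plain Python, with no Elasticsearch involved. The names `split_into_chunks`, `docs_matching_all`, `doc_id`, and `chunk_no` are made up for illustration, and the whitespace tokenization is a stand-in for a real analyzer.

```python
from collections import defaultdict


def split_into_chunks(doc_id, text, chunk_words=1000):
    """Split one large document into chunk records that remember their origin."""
    words = text.split()
    return [
        {
            "doc_id": doc_id,                 # id of the original document
            "chunk_no": i // chunk_words,     # position of the chunk in the original
            "text": " ".join(words[i:i + chunk_words]),
        }
        for i in range(0, len(words), chunk_words)
    ]


def docs_matching_all(chunks, terms):
    """Return ids of original documents that contain every term,
    even when the terms are spread across different chunks."""
    wanted = {t.lower() for t in terms}
    if not wanted:
        return set()
    # One "query" per term: collect the original ids of the chunks that hit.
    docs_per_term = defaultdict(set)
    for chunk in chunks:
        tokens = set(chunk["text"].lower().split())
        for term in wanted & tokens:
            docs_per_term[term].add(chunk["doc_id"])
    # The "group by id" step: keep only ids that were hit by every term.
    return set.intersection(*(docs_per_term[t] for t in wanted))


if __name__ == "__main__":
    chunks = split_into_chunks("a", "the quick brown fox jumps over the lazy dog", 3)
    chunks += split_into_chunks("b", "quick red fox", 3)
    print(docs_matching_all(chunks, ["quick", "dog"]))
```

In this sketch "quick" and "dog" land in different chunks of document `a`, which is exactly the case where a single AND query against the chunks would miss the document.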

Thanks!

warkolm · July 26, 2016, 10:00am

Maybe parent/child? The parent would contain the meta-info about the doc, and the children would be its chapters, pages, or whatever unit you split on.
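
As a rough sketch of how an "AND" search could look with that layout: one `has_child` clause per term, combined in a `bool` query, so that a parent matches when each term appears in at least one of its children. The helper name `all_terms_query`, the child type `chunk`, and the field `text` are assumptions for illustration. How the parent/child relation is declared in the mapping depends on the Elasticsearch version (a `_parent` mapping in older releases, a `join` field in later ones), so only the query body is built here and nothing is sent to a cluster.

```python
import json


def all_terms_query(terms, child_type="chunk", text_field="text"):
    """Build a query body that matches parents having, for every term,
    at least one child whose text matches that term."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "has_child": {
                            "type": child_type,  # hypothetical child type name
                            "query": {"match": {text_field: term}},
                        }
                    }
                    for term in terms
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(all_terms_query(["quick", "dog"]), indent=2))
```

The hits are then the parent documents themselves, so no client-side grouping by id is needed.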
