Search over large documents

makitka2007 · February 10, 2019, 8:09pm

hi all

we have an index in ES that we use both for search and for displaying a resource card in the UI. the card is shown when the user clicks a title in the search results, and the document is fetched by its id. about 10% of the fields are indexed; the other 90% are only kept in the source, mapped with "index": false.

does it make sense to split this data into 2 separate indexes, so that the first contains only the fields we need for search/filter/aggregation and for the search results page, and the second contains all remaining fields? will it improve search performance? or does the size of the document not matter much, because ES uses the inverted index for the indexed fields anyway and only reads the source of the documents it has found? would the benefit of such a split be significant here?
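To make the two layouts concrete, here is a rough sketch of the mappings being compared. The field names and the index names `resources_search` and `resources_details` are made up for illustration (the real ones are not given here), and the body uses the typeless mapping format of Elasticsearch 7+; on 6.x the properties would sit under a type name.

```python
import json

# Hypothetical fields: the real field names are not given in the thread.
SEARCH_FIELDS = {
    "title": {"type": "text"},
    "category": {"type": "keyword"},
}
DISPLAY_ONLY_FIELDS = {
    # "index": False keeps the value in _source but makes it unsearchable.
    "description": {"type": "text", "index": False},
    "license_note": {"type": "keyword", "index": False},
}


def single_index_mapping():
    """Layout A: one index holding searchable and display-only fields."""
    return {"mappings": {"properties": {**SEARCH_FIELDS, **DISPLAY_ONLY_FIELDS}}}


def split_index_mappings():
    """Layout B: a small search index plus a details index read by id."""
    return {
        "resources_search": {"mappings": {"properties": dict(SEARCH_FIELDS)}},
        "resources_details": {"mappings": {"properties": dict(DISPLAY_ONLY_FIELDS)}},
    }


if __name__ == "__main__":
    print(json.dumps(single_index_mapping(), indent=2))
    print(json.dumps(split_index_mappings(), indent=2))
```

Each dict is only a request body for index creation; nothing here talks to a cluster.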

thanks in advance for any answers

warkolm · February 11, 2019, 1:11am

It'll make a difference when it needs to do the fetch phase of a search. The overview here is for an older Elasticsearch version, but it's still valid.

makitka2007 · February 11, 2019, 8:22am

so, performance in the search phase will be about the same in both cases, right?

Christian_Dahlqvist · February 11, 2019, 8:31am

I suspect keeping the data in a single index/document will be faster as there will be fewer queries and disk seeks to retrieve all the data.

makitka2007 · February 11, 2019, 9:25am

sorry for the unclear explanation. for search i only need the data stored in the first index; i will search over it and build the search results page from it. only when the user clicks on a result will i send a request to the second index by document id, to retrieve the document with all remaining fields for display.
will that be faster than storing all fields in one document from the start?
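The two-step flow described above could look roughly like this. It is only a sketch: the index names, the `title` field, and the helper functions are hypothetical, and the code just builds the HTTP method, path, and body for each request rather than calling a cluster.

```python
from urllib.parse import quote

# Hypothetical index names for the split layout.
SEARCH_INDEX = "resources_search"
DETAILS_INDEX = "resources_details"


def build_search_request(text, size=10):
    """Step 1: query the small index that backs the results page."""
    body = {"query": {"match": {"title": text}}, "size": size}
    return ("POST", f"/{SEARCH_INDEX}/_search", body)


def build_card_request(doc_id):
    """Step 2: after a click, fetch the full card from the details index by id."""
    # Percent-encode the id so characters like "/" or spaces stay in one path segment.
    return ("GET", f"/{DETAILS_INDEX}/_doc/{quote(str(doc_id), safe='')}", None)


if __name__ == "__main__":
    print(build_search_request("large documents"))
    print(build_card_request("42"))
```

In the single-index layout, step 2 is the same GET by id, just against the same index that was searched, so the split changes what the search request has to load in its fetch phase, not the number of requests per click.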

