Performance impact due to _source storage


(Imran Siddique) #1

Suppose we store the _source so that we can do partial updates:

https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html

Other than the extra disk space needed to store the data, is there any known query scenario where performance will take a hit?
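For context, here is a rough sketch of why a partial update depends on _source: the stored document is read back, the partial document is merged into it, and the result is reindexed. This is only a plain-Python model of that merge, not Elasticsearch code. The function name `apply_partial_update` and the sample fields are made up for illustration, and it assumes nested objects are merged recursively while scalars and arrays are overwritten.

```python
import copy


def apply_partial_update(source, partial):
    """Return a new document with `partial` merged into `source`.

    Hypothetical helper that models the merge step of a partial update.
    Assumption: objects merge recursively, everything else is replaced.
    """
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge field by field.
            merged[key] = apply_partial_update(merged[key], value)
        else:
            # Scalars, arrays, and new fields simply overwrite.
            merged[key] = copy.deepcopy(value)
    return merged


# Made-up document standing in for a stored _source.
stored_source = {"title": "My post", "views": 0, "meta": {"lang": "en"}}
updated = apply_partial_update(
    stored_source, {"views": 1, "meta": {"tags": ["es"]}}
)
print(updated)
```

Without the original _source there is nothing to merge into, which is why disabling it rules out partial updates.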


(Nik Everett) #2

Typically storing source is the right way to go, but if you frequently
highlight or return small fields (< 100KB) on documents that contain large
fields (> 2MB), you'll start to see decoding the source become a performance hit.

On the other hand, this is something that the Elasticsearch team is talking
about working on, so I expect that problem will disappear in a while. It's a
bit of a complex issue, but it's a recognized one.
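
In the meantime, one possible workaround is to mark the small fields as stored and ask for those instead of the source, so the large _source does not have to be decoded for every hit. The sketch below only builds the JSON bodies as Python dicts. The field names `title` and `body` and the helper names are made up, and the exact syntax is an assumption that depends on your version: `stored_fields` and the `text` type are the spelling in Elasticsearch 5.x and later, while older releases use `fields` and `string`.

```python
import json


def build_mapping():
    """Mapping sketch: store the small field separately from _source.

    Assumption: `title` is small and `body` is the large field.
    """
    return {
        "properties": {
            "title": {"type": "text", "store": True},
            "body": {"type": "text"},
        }
    }


def build_search_body(query_text):
    """Search body sketch: skip _source, return only the stored field."""
    return {
        "query": {"match": {"body": query_text}},
        "_source": False,
        "stored_fields": ["title"],
    }


mapping = build_mapping()
search_body = build_search_body("performance")
print(json.dumps(search_body, indent=2))
```

Whether this actually helps depends on your field sizes, so measure before and after.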


(Imran Siddique) #3

thanks @nik9000


(Imran Siddique) #4

@nik9000 - Looks like I'm running into the issue you explained. My _source is large and I'm using a nested mapping. I have a bunch of fields, but I only want to retrieve a few of them. One of the fields to be retrieved is a nested field. I tried setting store to true for the nested field, but it looks like I'm running into this issue: https://github.com/elastic/elasticsearch/issues/5245

@johtani / @Martijn_van_Groninge - was this issue fixed? [https://github.com/elastic/elasticsearch/issues/5245]

