'store=true' of field mapping?


(Min Cha) #1

Hello folks.

I have a simple question about "store=true" of field mapping.
I thought that "store=true" is like a clustered index in MySQL, so I expected a
benefit when searching and accessing stored fields, since it would avoid the
random I/O needed to retrieve the original JSON.

In my performance testing, however, "store=true" showed no performance
benefit, even when I used '_fields' in my script. '_doc' is faster than
'_fields'. If this is a fact, why is there a "store=true" option?

Please give me some advice.
Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/70eef818-b3be-454d-b9fa-5f9856c18a70%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

This is true. Using the 'store' parameter does not have any impact on
the field being searchable (the search is not executed against the stored
field, but against the inverted index). Setting it for a field will
result in the content of that field being stored separately. This is
useful if you do not store the whole source of a document (which is
returned by default in any search query), but still need to access certain
fields.
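
To make that concrete, here is a minimal sketch in Python that only builds the JSON bodies involved; it does not talk to a cluster. The type name "person", the field names and the helper functions are made up for illustration. The "string" field type and the "fields" search key reflect the Elasticsearch versions current at the time of this thread; newer releases use different names (for example "stored_fields"), so check the documentation for your version.

```python
import json


def build_mapping(type_name, stored_fields, keep_source=False):
    """Build a mapping body that stores the given fields separately.

    Hypothetical helper: the names are illustrative, not part of any
    Elasticsearch client library.
    """
    properties = {
        # "store" asks Lucene to keep the field value retrievable on its own.
        name: {"type": "string", "store": True}
        for name in stored_fields
    }
    return {
        type_name: {
            # With _source disabled, stored fields are the only way to get
            # the original values back.
            "_source": {"enabled": keep_source},
            "properties": properties,
        }
    }


def build_search(query_text, search_field, wanted_fields):
    """Build a search body that asks for specific fields, not _source."""
    return {
        "query": {"match": {search_field: query_text}},
        "fields": list(wanted_fields),
    }


mapping = build_mapping("person", ["name", "location"])
search = build_search("berlin", "location", ["name"])

print(json.dumps(mapping, indent=2, sort_keys=True))
print(json.dumps(search, indent=2, sort_keys=True))
```

Note that the query still runs against the inverted index for "location"; "store" only affects what can be returned afterwards.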

You should read the Lucene documentation about this if you are interested
in more (including the reason why this setting exists, and why you might be
able to pretty much ignore it with Elasticsearch).

--Alex

On Mon, Dec 2, 2013 at 11:02 AM, Min Cha minslovey@gmail.com wrote:

Hello folks.

I have a simple question about "store=true" of field mapping.
I thought that "store=true" is like a clustered index in MySQL, so I expected a
benefit when searching and accessing stored fields, since it would avoid the
random I/O needed to retrieve the original JSON.

In my performance testing, however, "store=true" showed no performance
benefit, even when I used '_fields' in my script. '_doc' is faster than
'_fields'. If this is a fact, why is there a "store=true" option?

Please give me some advice.
Thanks.



(Michael Lussier) #3

Hello Min Cha.

Hello folks.

I have a simple question about "store=true" of field mapping.

I assume you mean "store=yes"?

If this is a fact, why is there a "store=true" option?

The reason for "store=yes" is, for instance, to explicitly store a field when _source is disabled. It can also help when documents are very large and you only want specific fields: fetching just those fields can be faster for the request than loading the very large _source. In some cases it can be a good idea to keep _source and also specify stored fields.

But in most cases storing _source and parsing it will be faster. Say, for example, you store the "name", "age" and "location" fields of a document by themselves and want to retrieve those fields from several documents. Retrieving each field separately will in most cases be slower than grabbing the _source and parsing "name", "age" and "location" out of it. Also remember that store is not enabled by default, and _source is enabled by default.
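
On the client side the two approaches look slightly different in the response. The sketch below is a hedged illustration in Python, with hand-written sample hits rather than real cluster output. It assumes a hit carries either a "_source" object or a "fields" object, and that stored field values may come back wrapped in a list (as far as I know, newer versions always return them as arrays). The helper name pick_fields is hypothetical.

```python
def pick_fields(hit, wanted):
    """Return {field: value} from a search hit, whichever shape it has.

    Hypothetical helper for illustration only.
    """
    if "_source" in hit:
        # _source is the whole original document; just pick what we need.
        source = hit["_source"]
        return {name: source[name] for name in wanted if name in source}

    stored = hit.get("fields", {})
    result = {}
    for name in wanted:
        if name not in stored:
            continue
        value = stored[name]
        # Unwrap single-element lists so both shapes give the same result.
        if isinstance(value, list) and len(value) == 1:
            value = value[0]
        result[name] = value
    return result


# Hand-written sample hits (assumed shapes, not real cluster output).
hit_with_source = {
    "_id": "1",
    "_source": {"name": "Min", "age": 30, "location": "Seoul", "bio": "..."},
}
hit_with_fields = {
    "_id": "1",
    "fields": {"name": ["Min"], "age": [30], "location": ["Seoul"]},
}

wanted = ["name", "age", "location"]
from_source = pick_fields(hit_with_source, wanted)
from_fields = pick_fields(hit_with_fields, wanted)
print(from_source)
print(from_fields)
```

Either way the application ends up with the same three values; the difference discussed above is in how much work the server does to produce them.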

This post by kimchy (Shay Banon):
http://elasticsearch-users.115913.n3.nabble.com/What-does-it-mean-to-quot-store-quot-a-field-td3514176.html#a3562299

And this post by javanna (Luca Cavanna)

These may offer more insight. I'm basically echoing what they say, in the way I understand store.

