Is there ever a reason to store _id?

According to the documentation on _id
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-id-field.html
it is possible to store _id but it never gives a reason why that would be
useful.

I have a use case where I am exporting all ids from ES using scan/scroll
with no query. If I set the fields parameter to nothing/blank I get back
the _id automatically. I assume this happens by parsing the _uid. If I
store the _id I get back the _id in both the metadata section of the
document and the fields property which seems redundant.
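
For anyone following along, here is a rough sketch of the export described above. It assumes the ES 1.x REST conventions (search_type=scan, a scroll timeout, and an empty "fields" list); the index name "myindex", the helper names, and the sample response are all made up for illustration, and nothing here talks to a real cluster.

```python
import json


def build_scan_request(index, scroll="1m"):
    """Return (path, body) for a scan/scroll search that requests no fields.

    Assumption: ES 1.x style URL parameters and an empty "fields" list.
    """
    path = "/{0}/_search?search_type=scan&scroll={1}".format(index, scroll)
    body = {"query": {"match_all": {}}, "fields": []}
    return path, json.dumps(body, sort_keys=True)


def extract_ids(page):
    """Collect the _id found in the metadata of each hit of one scroll page."""
    return [hit["_id"] for hit in page.get("hits", {}).get("hits", [])]


# Hypothetical scroll page: with no fields requested, each hit carries only
# metadata (_index, _type, _id), so there is no "fields" or "_source" key.
sample_page = {
    "_scroll_id": "hypothetical-scroll-id",
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "myindex", "_type": "doc", "_id": "a1"},
            {"_index": "myindex", "_type": "doc", "_id": "b2"},
        ],
    },
}

path, body = build_scan_request("myindex")
print(path)
print(body)
print(extract_ids(sample_page))
```

The point is only that the ids are read from the hit metadata, so nothing extra has to be stored for this export to work.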

I am a little unsure what ES does when a request with no fields and no query
comes in. I assume it's scanning something (what?) and then fetching the
metadata from somewhere (where?). If what it's scanning and what it's
fetching from are the same thing then storing the _id seems moot.

So, is there any performance advantage to storing the _id for scan/scroll
requests, or in any specific case?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8dc87e17-0826-4871-afab-419e084a5883%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I wanted to give this a friendly bump and follow up with my experience.

After doing some light testing I can't see a reason to ever store _id.
Doing so inflated the index size and response objects but offered no
improvements on scanning. So, at least for my case it doesn't seem to make
sense. Perhaps there is another use case I am missing.

Thanks,
Andrew White

(Follow-up to the original question posted Wednesday, January 21, 2015 at 7:11:38 AM UTC-6.)

Setting fields to "stored" in Elasticsearch is generally not required and is
bad practice, since all fields are extracted from _source when they are
required, and _source benefits from block compression and more.

This feature only becomes helpful in a few edge cases, where you do not want
to save the _source and instead enable "stored" for a few fields (usually
several small ones out of many).
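
To make that edge case concrete, here is a minimal sketch of such a mapping, built as a plain Python dict. It assumes ES 1.x mapping syntax ("_source": {"enabled": false} and "store": true on "string" fields); the type name "doc", the field names, and the helper function are hypothetical.

```python
import json


def stored_fields_mapping(doc_type, stored_fields):
    """Build a mapping body that disables _source and stores a few fields.

    Assumption: ES 1.x syntax; every field is mapped as a stored "string".
    """
    properties = {
        name: {"type": "string", "store": True} for name in stored_fields
    }
    return {
        "mappings": {
            doc_type: {
                "_source": {"enabled": False},
                "properties": properties,
            }
        }
    }


# Hypothetical type and field names, for illustration only.
mapping = stored_fields_mapping("doc", ["title", "author"])
print(json.dumps(mapping, indent=2, sort_keys=True))
```

With _source disabled, only the fields marked as stored can be returned from a search, which is why this is a trade-off for a handful of small fields rather than a default.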

HTH

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

(In reply to Andrew White's follow-up of Wed, Feb 4, 2015 at 2:57 PM.)