How to get all document _id of an elasticsearch index

shenwno · August 28, 2014, 7:52pm

Hi all,

I'm trying to figure out a way to retrieve the '_id' (the ES internal _id)
of every document in an index. The index has about 20 million documents.
However, when I use the get API, ES paginates the results and returns only
part of the data.
I'm not sure whether the bulk API could handle this task, and at the scale of
this index it would still be a heavy query.
Is there any way I can retrieve the ids directly from the raw filesystem?

Thanks for helping
Wei

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b22e2015-f8cc-408a-857a-14a4fbbcf6a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

vineeth_mohan_2 · August 29, 2014, 1:53am

Hello Wei,

You can scan through all the documents in ES using scan and scroll.

Thanks
Vineeth
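
For readers who want to see what the scroll approach suggested above could look like, here is a minimal sketch in Python using only the standard library. It is not from the original reply. It assumes a recent Elasticsearch version, where the old `search_type=scan` has been removed and sorting by `_doc` is the usual substitute; on the 1.x releases current when this thread was written, the first request would use scan instead. The names `iter_ids` and `http_post`, the page size, and the keep-alive value are all hypothetical choices made for illustration.

```python
import json
import urllib.request
from typing import Callable, Iterator

# A transport takes (path, json_body) and returns the decoded JSON response.
Transport = Callable[[str, dict], dict]


def http_post(base_url: str) -> Transport:
    """Build a transport that POSTs JSON to base_url + path (hypothetical helper)."""
    def post(path: str, body: dict) -> dict:
        req = urllib.request.Request(
            base_url.rstrip("/") + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return post


def iter_ids(post: Transport, index: str,
             page_size: int = 1000, keep_alive: str = "1m") -> Iterator[str]:
    """Yield the _id of every document in `index`, one scroll page at a time."""
    # The first request opens the scroll context. "_source": False skips the
    # document bodies, and sorting by _doc avoids scoring work.
    page = post(
        f"/{index}/_search?scroll={keep_alive}",
        {
            "size": page_size,
            "_source": False,
            "sort": ["_doc"],
            "query": {"match_all": {}},
        },
    )
    # An empty page of hits means the scroll is exhausted.
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit["_id"]
        # Ask for the next page using the most recent scroll id.
        page = post(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )
```

Assuming a node at the placeholder address `http://localhost:9200` and a placeholder index name `my_index`, it could be used as `for doc_id in iter_ids(http_post("http://localhost:9200"), "my_index"): print(doc_id)`. Because ids are yielded page by page, the full set of 20 million never has to sit in memory at once.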


