How to get all document _ids of an Elasticsearch index


(Hanan) #1

I am trying to pull a list of all doc_ids in an Elasticsearch index. The index contains 15 million docs, and I am only interested in the doc_id. I built the search query in Java and set the "size" parameter to 15 million, but I got an exception saying the maximum size is 10000. Any idea how I can achieve this?

I am using a simple HTTP POST request to submit the query, as follows:

{
  "stored_fields": ["au.id", "au.doc_count"],
  "query": {
    "match_all": {}
  }
}


(Henning Andersen) #2

Hi @telebh,

You should be able to pull out all the ids using a scroll search, see:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
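
For reference, here is a rough sketch of what a scroll loop over plain HTTP could look like in Java 11+, since that is how the query is being submitted. The cluster URL, the index name `my-index`, the page size of 1000, the `1m` keep-alive, and the `ScrollIds` class with its helper methods are all placeholders of mine, not anything from the original setup. It sets `"_source": false` so each hit carries only metadata such as `_id`, and it pulls ids out with a regex purely to stay dependency-free; that assumes ids contain no escaped quotes, and a real JSON parser would be the more robust choice.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScrollIds {
    // Placeholders (assumptions): adjust to your cluster and index.
    static final String HOST = "http://localhost:9200";
    static final String INDEX = "my-index";

    // Naive extraction; assumes ids contain no escaped quotes.
    static final Pattern ID = Pattern.compile("\"_id\"\\s*:\\s*\"([^\"]*)\"");
    static final Pattern SCROLL_ID = Pattern.compile("\"_scroll_id\"\\s*:\\s*\"([^\"]*)\"");

    // Collects every "_id" value found in one response page.
    static List<String> extractIds(String json) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(json);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    // Returns the "_scroll_id" of a response page, or null if absent.
    static String extractScrollId(String json) {
        Matcher m = SCROLL_ID.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Sends a JSON body with HTTP POST and returns the response body.
    static String post(HttpClient client, String url, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // First request opens a scroll context that is kept alive for 1 minute.
        String page = post(client, HOST + "/" + INDEX + "/_search?scroll=1m",
                "{\"size\":1000,\"_source\":false,\"sort\":[\"_doc\"],"
                        + "\"query\":{\"match_all\":{}}}");

        long total = 0;
        List<String> ids = extractIds(page);
        // An empty page of hits means the scroll is exhausted.
        while (!ids.isEmpty()) {
            for (String id : ids) {
                System.out.println(id);
            }
            total += ids.size();
            // Always use the most recent scroll id for the next page.
            String scrollId = extractScrollId(page);
            page = post(client, HOST + "/_search/scroll",
                    "{\"scroll\":\"1m\",\"scroll_id\":\"" + scrollId + "\"}");
            ids = extractIds(page);
        }
        System.err.println("Fetched " + total + " ids");
    }
}
```

Each request stays well under the 10000 limit (the default `index.max_result_window`), and the loop simply keeps asking for the next page until one comes back empty. The sketch lets the scroll context expire on its own after the keep-alive; explicitly clearing it once you are done frees resources on the cluster sooner.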


(Hanan) #3

Yes, that worked. Thank you, Henning!