Reindex: how to limit the number of docs for testing

I am having problems getting reindex to do what I want. The indexes I have available for testing have over 100,000 docs, so when things fail I get 100,000 tracebacks in the logs!

I see from the docs that there is a max_docs parameter listed under the query options, but there is no example of its use. I assumed that the query is associated with the source and tried this:


  def do_reindex(name = nil, new_name = nil)
    params = { source: { index: name || @name }, target: { index: new_name || @options[:new_name] } }
    params[:script] = @options[:script] if @options[:script]
    # Assumption being tested: max_docs belongs inside the source query.
    if @options[:max_docs]
      params[:source][:query] = { max_docs: @options[:max_docs] }
    end
    pp params

    @client.reindex(params)
  end

This prints:

there are 1 indices in authx_2019.09.02 to reindex
{:source=>{:index=>"authx_2019.09.02", :query=>{:max_docs=>10}},
 :target=>{:index=>"authentication_2019.09.02"}}
warning: 299 Elasticsearch-7.10.0-51e9d6f22758d0374a0f3f5c6e8f3a7997850f96 "[types removal] Specifying types in bulk requests is deprecated."
warning: 299 Elasticsearch-7.10.0-51e9d6f22758d0374a0f3f5c6e8f3a7997850f96 "[types removal] Specifying types in bulk requests is deprecated."

BTW, the Ruby API converts the Ruby hash into JSON.

If max_docs had been applied correctly, only one bulk operation would have been needed, but the two warnings above suggest that more than one ran.
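
For comparison, the Reindex API reference shows max_docs at the top level of the request body, next to source and dest, rather than inside source.query. Below is a minimal sketch of building such a body as a plain Ruby hash. Note that build_reindex_body is a hypothetical helper, not part of any gem, and the index names are simply the ones from the output above.

```ruby
require 'json'

# Hypothetical helper (not part of the elasticsearch gem): builds a request
# body for POST /_reindex. In the REST API the destination key is "dest", and
# max_docs sits at the top level rather than inside source.query.
def build_reindex_body(source_index, dest_index, max_docs: nil, script: nil)
  body = {
    source: { index: source_index },
    dest: { index: dest_index }
  }
  # Optional keys are only added when a value was supplied.
  body[:max_docs] = max_docs if max_docs
  body[:script] = script if script
  body
end

body = build_reindex_body('authx_2019.09.02', 'authentication_2019.09.02', max_docs: 10)
puts JSON.pretty_generate(body)
```

With the official Ruby client, a hash like this would normally be passed as client.reindex(body: body). Whether that changes the behaviour described in this post has not been tested here.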

I suspect this is an issue with the Ruby binding for the API and have logged an issue on the GitHub repository. After a bit of experimentation I discovered that I could add anything I liked to the source and target hashes and it would be completely ignored, i.e. no errors for garbage.

Further confirmation: it works when I make the request from Ruby with a direct HTTP POST.
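
The post does not show the request that worked, so the following is only a sketch of what a direct POST to the _reindex endpoint might look like using Ruby's standard Net::HTTP. The host, port, and lack of authentication are assumptions, and send_reindex is a hypothetical wrapper. The sending step is left commented out so the request can be inspected without a running cluster.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed endpoint: a local, unauthenticated node. Adjust for your cluster.
uri = URI('http://localhost:9200/_reindex')

# max_docs is placed at the top level of the body, as in the Reindex API docs.
payload = {
  source: { index: 'authx_2019.09.02' },
  dest: { index: 'authentication_2019.09.02' },
  max_docs: 10
}

request = Net::HTTP::Post.new(uri.request_uri, 'Content-Type' => 'application/json')
request.body = JSON.generate(payload)

# Hypothetical wrapper: opens a connection and sends the prepared request.
def send_reindex(uri, request)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

# Uncomment to actually run the reindex against the assumed node:
# response = send_reindex(uri, request)
# puts response.code, response.body
```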
