I'm trying to implement an index exporter by doing a "match_all" query
and scrolling through the entire index, 100 or 1000 documents at a
time. I'm seeing a significant slowdown in scrolling over time. The
first scroll via the REST API returns in < 50 ms, but once I've
scrolled through 1.5 million of the 2 million total docs, the time to
execute is > 1 second. I've set the scroll timeout to 10 seconds,
which performs better than 10 minutes, but I can't decrease the
timeout much more without risking timing out between calls.
I'm wondering a) whether this dramatic slowdown is expected, and b)
whether there's a better way to scroll through all documents quickly.
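
For reference, the export loop is roughly the sketch below. It is not my exact code: it assumes an Elasticsearch-style scroll endpoint that takes the scroll id in a JSON body (the request shape differs between versions), and `scroll_all`, `post_json`, the base URL and the index name are hypothetical names used only for illustration. It times each page so the slowdown can be seen per call.

```python
import json
import time
import urllib.request


def post_json(url, body):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(base_url, index, page_size=1000, keep_alive="10s", post=post_json):
    """Yield (hits, seconds) for every page of a match_all scroll.

    Assumption: the first page comes from <index>/_search?scroll=...,
    and later pages from _search/scroll with the previous scroll id.
    """
    start = time.perf_counter()
    page = post(
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    while True:
        elapsed = time.perf_counter() - start
        hits = page["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        yield hits, elapsed
        start = time.perf_counter()
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


if __name__ == "__main__":
    # Placeholder host and index name; adjust to the cluster being exported.
    total = 0
    for hits, seconds in scroll_all("http://localhost:9200", "myindex"):
        total += len(hits)
        print(f"{total} docs exported, last page took {seconds * 1000:.0f} ms")
```

The `post` parameter exists only so the loop can be exercised without a running cluster; the keep-alive is renewed on every call, so it only has to cover the gap between two consecutive requests.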