List all entries matching a query


(sezgin küçükkaraaslan) #1

Is there a way to list all entries matching a query without providing the
size parameter? We don't know the actual size beforehand.

Sezgin Kucukkaraaslan
www.ifountain.com


(Lukáš Vlček) #2

I don't think there is, but you can learn the needed size using the count
query: http://www.elasticsearch.com/docs/elasticsearch/rest_api/count/
It should be efficient.
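
A minimal sketch of that two-step approach, in Python with only the standard library. It assumes the `_count` and `_search` endpoints accept a `{"query": ...}` body and reply with `count` and `hits.hits`, as in current Elasticsearch releases; the exact request shape in the 2010 API may have differed. The names `post_json` and `fetch_all_matching`, and the injectable `post` parameter, are made up for illustration.

```python
import json
from urllib import request


def post_json(url, body):
    """Send a JSON body with POST and decode the JSON reply."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def fetch_all_matching(base_url, index, query, post=post_json):
    """Ask for the match count first, then request exactly that many hits."""
    body = {"query": query}
    total = post(f"{base_url}/{index}/_count", body)["count"]
    if total == 0:
        return []  # nothing matched, so skip the second round trip
    result = post(f"{base_url}/{index}/_search", {**body, "size": total})
    return result["hits"]["hits"]
```

Usage would look like `fetch_all_matching("http://localhost:9200", "myindex", {"match_all": {}})`, where the host and index name are placeholders. Documents indexed between the two calls are not reflected in the count, so the result can be slightly short on a busy index.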
Rgds,
Lukas



(Andrew Harvey) #3

Set size to a Really Big Number™

A.



(Lukáš Vlček) #4

Then I guess you would want to check
http://www.elasticsearch.com/docs/elasticsearch/rest_api/search/scroll/ as
well. Generally, I am not sure Elasticsearch is designed to return a huge
number of documents in a single response.
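
A rough sketch of paging through all matches with scroll, in Python. It assumes the request shape used by current Elasticsearch releases (an initial `_search?scroll=...` call, then repeated calls to `_search/scroll` with the returned `_scroll_id`); the 2010 API may have looked different. `scroll_all` is a hypothetical helper, and `post` is a caller-supplied function taking `(url, body)` and returning the decoded JSON reply.

```python
def scroll_all(base_url, index, query, post, page_size=500, keep_alive="1m"):
    """Yield every hit for `query`, one page at a time, via the scroll API."""
    # The first request opens the scroll context and returns the first page.
    page = post(
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"query": query, "size": page_size},
    )
    # An empty page means the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each follow-up request passes the most recent scroll id.
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )
```

The client never needs to know the total up front, and no single response holds more than `page_size` documents.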



(Andrew Harvey) #5

I've found scroll to be patchy at best. There is an open bug for it.

The Really Big Number™ approach has its flaws, like the network load and the possibility that you could knock over your client or your server if the response generated is larger than either can handle. However, if your response size is always going to be < 1000 documents and they're of a reasonable size, you can probably get away with it.

A.



(sezgin küçükkaraaslan) #6

The Really Big Number approach was the first thing that came to my mind. But
I didn't want to provide 1000000 as the size for 200 entries, fearing that
some memory might be reserved beforehand, or that something else might hurt
performance. If this scenario is OK performance-wise, I can use it.



(Paul Smith) #7

You definitely don't want to use just any big number because, unless ES
does something extra fancy (maybe!), Lucene will allocate a
PriorityQueue with an array as large as the number of results you
request. This may 'work' but will not survive real-world use, I suspect.
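
To get a feel for the allocation concern, here is a toy illustration in Python. It is not Lucene's code: it only models a queue whose backing array is sized from the requested `size` rather than from the number of documents that actually match. `preallocated_slots_bytes` is a made-up helper.

```python
import sys


def preallocated_slots_bytes(requested_size):
    """Bytes taken by a list of `requested_size` empty slots.

    Stand-in for a fixed-size queue array that is allocated before any
    document has been scored.
    """
    return sys.getsizeof([None] * requested_size)


small = preallocated_slots_bytes(200)
huge = preallocated_slots_bytes(1_000_000)
print(f"size=200: {small} bytes, size=1000000: {huge} bytes")
```

In this model the cost is paid per request regardless of whether 200 or 1000000 documents match, which is why an arbitrarily large size is wasteful even for small result sets.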

The count API is what you want, but it would be nice if ES could be
given a value that means 'all' results and do the count in the same
RPC call, saving the round trip.

Paul



(sezgin küçükkaraaslan) #8

Thanks for the notice.
Yes, if ES can provide something like "all", that would be great.

Sezgin Kucukkaraaslan



(Shay Banon) #9

This can be done (though it will be a bit slower). Open an issue for this.



(sezgin küçükkaraaslan) #10

Thanks...


