AND filter with Size and From gives different results


(Dusty Doris) #1

I am running a filter against an index using the size and from parameters and
getting different results from the same query. How can I make this filter
return consistent results so that size/from paging is useful?

all results

curl 'http://localhost:9200/providers/provider/_search?pretty=true&fields=_id' -d \
'{"filter":{"and":[{"term":{"names.last_name":"doris"}},{"term":{"addresses.state":"fl"}}]}}'

{
  "took" : 34,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1013944560",
      "_score" : 1.0
    }, {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1851535512",
      "_score" : 1.0
    }, {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1568607125",
      "_score" : 1.0
    } ]
  }
}

size=1, from=1

Running the same request 4 times cycles through two different entries:

curl 'http://localhost:9200/providers/provider/_search?pretty=true&size=1&from=1&fields=_id' -d \
'{"filter":{"and":[{"term":{"names.last_name":"doris"}},{"term":{"addresses.state":"fl"}}]}}'

{
  "took" : 35,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1568607125",
      "_score" : 1.0
    } ]
  }
}

{
  "took" : 25,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1851535512",
      "_score" : 1.0
    } ]
  }
}

{
  "took" : 33,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1568607125",
      "_score" : 1.0
    } ]
  }
}

{
  "took" : 25,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "providers",
      "_type" : "provider",
      "_id" : "1851535512",
      "_score" : 1.0
    } ]
  }
}


(David Pilato) #2

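The issue is that a filter-only request is effectively a match_all query,
so every hit gets a score of 1.
ES sorts results by score, so the order among hits with equal scores is not fixed.

You can sort the results by _id to get a stable order.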

HTH
David
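
For anyone who wants to try the suggestion, here is a minimal sketch in Python that builds the request from the first post with an explicit sort added. The filter, host, and index are copied from the thread; the helper names (`build_search_body`, `build_curl`) are made up for illustration, and it assumes `_id` is sortable in your mapping. If it is not, substitute another field that is unique per document.

```python
import json
import shlex

# The AND filter from the original post, unchanged.
FILTER = {
    "and": [
        {"term": {"names.last_name": "doris"}},
        {"term": {"addresses.state": "fl"}},
    ]
}


def build_search_body(sort_field="_id", order="asc"):
    """Return the request body with an explicit sort added.

    Assumption: sort_field is unique per document and sortable in the mapping.
    """
    return {"filter": FILTER, "sort": [{sort_field: order}]}


def build_curl(size, offset):
    """Render a curl command for one page (hypothetical helper)."""
    url = (
        "http://localhost:9200/providers/provider/_search"
        f"?pretty=true&size={size}&from={offset}&fields=_id"
    )
    body = json.dumps(build_search_body(), separators=(",", ":"))
    return f"curl {shlex.quote(url)} -d {shlex.quote(body)}"


if __name__ == "__main__":
    # Prints a command equivalent to the one in the first post, plus the sort.
    print(build_curl(size=1, offset=1))
```

With the sort in place, every hit has a distinct sort value, so repeated requests with the same size and from should land on the same document.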



(Dusty Doris) #3

Ok, sorting by _id will work.

Just to clarify, in case I run into this with more advanced searches as
well: if two entries have equal scores, is the order they are returned in
not guaranteed to be the same on subsequent searches unless I sort on
something unique?

Thanks.
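
The situation described in this question can be illustrated with a toy model in Python. This is not how Elasticsearch merges shard results internally; the shuffle below is only a stand-in for "tied hits arrive in an arbitrary order", and all function names are hypothetical. The three IDs are the ones from the first post.

```python
import random

# IDs from the thread; every hit has the same score because only a filter is used.
HITS = [
    {"_id": "1013944560", "_score": 1.0},
    {"_id": "1851535512", "_score": 1.0},
    {"_id": "1568607125", "_score": 1.0},
]


def page(hits, size, offset, sort_key, rng):
    """Return one page of IDs after simulating an arbitrary arrival order."""
    arrived = hits[:]
    rng.shuffle(arrived)  # stand-in for an unspecified order among tied hits
    # Python's sort is stable, so hits with equal keys keep their arrival order.
    ranked = sorted(arrived, key=sort_key)
    return [hit["_id"] for hit in ranked[offset:offset + size]]


def by_score(hit):
    """Sort by score only (descending); all three hits tie."""
    return -hit["_score"]


def by_score_then_id(hit):
    """Sort by score, then break ties on the unique _id."""
    return (-hit["_score"], hit["_id"])


def distinct_pages(sort_key, runs=200, seed=0):
    """Collect the distinct results of requesting size=1, from=1 repeatedly."""
    rng = random.Random(seed)
    return {tuple(page(HITS, 1, 1, sort_key, rng)) for _ in range(runs)}


if __name__ == "__main__":
    print("score only:   ", distinct_pages(by_score))
    print("score then id:", distinct_pages(by_score_then_id))
```

Sorting on score alone lets the page contents vary from run to run, while adding the unique tie-breaker pins each page to a single document.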



(David Pilato) #4

Yes.

You may also want to look at the Scan & Scroll feature: http://www.elasticsearch.org/guide/reference/api/search/scroll.html

It could help.
David
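
As a rough sketch of the scroll idea in Python: the loop below follows the `_scroll_id` returned with each response until a page comes back empty. The two callables are hypothetical stand-ins for the HTTP requests (the exact endpoints and parameters depend on your Elasticsearch version, so they are left out), and the in-memory fake server exists only to make the sketch runnable. It assumes a plain scroll where the first response already carries hits; the loop would need adjusting if the first response is empty.

```python
def scroll_all(start_search, continue_scroll):
    """Yield every hit once by following scroll IDs until a page is empty.

    start_search() -> first response dict (hypothetical callable that issues
    the initial search with a scroll timeout).
    continue_scroll(scroll_id) -> next response dict (hypothetical callable).
    """
    response = start_search()
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            break
        for hit in hits:
            yield hit
        response = continue_scroll(response["_scroll_id"])


def make_fake_server(ids, page_size):
    """Build in-memory stand-ins for the two HTTP calls (demonstration only)."""
    state = {"pos": 0}

    def next_page(_scroll_id=None):
        start = state["pos"]
        state["pos"] = start + page_size
        chunk = ids[start:start + page_size]
        return {
            "_scroll_id": f"cursor-{state['pos']}",
            "hits": {"hits": [{"_id": doc_id} for doc_id in chunk]},
        }

    # The same function serves as both the first request and the follow-ups.
    return next_page, next_page


if __name__ == "__main__":
    first, following = make_fake_server(
        ["1013944560", "1851535512", "1568607125"], page_size=2
    )
    for hit in scroll_all(first, following):
        print(hit["_id"])
```

Because the cursor lives on the server side, the client never computes offsets itself, which sidesteps the from/size ordering problem when the goal is to walk the whole result set.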



(Dusty Doris) #5

Thank you David.


