Elasticsearch returning fewer results even with size set to 50


#1

When I run my ES query, the total hits are reported as 7746.
Yet with the size param set to 50, the query returns only 46 results.
What could be the reason for ES not returning the last 4 results?

The top score is 1.17, while that of the 46th result is 0.17.
On setting the size to 70, the lowest-ranking record has a score of 0.15. Thus the score of the 50th record must lie between the two values.

Does ES maintain a threshold of sorts, based on the top score, below which results are not returned despite the size?


(Mark Harwood) #2

No. I think we'll need to see more details about your specific query/data/mapping to debug further.


(Amar - Persistent Systems) #3

Can you please post your query?


#4

Hi,

Thanks for the prompt response.
The index contains roughly 0.5 million documents, with a total size of 1.9 GB.
This is what the index mapping looks like:

"mapping_cluster" : {
  "mappings" : {
    "articles" : {
      "properties" : {
        "alexaRank" : {
          "type" : "long"
        },
        "content" : {
          "type" : "string"
        },
        "description" : {
          "type" : "string"
        },
        "imageURL" : {
          "type" : "string"
        },
        "pubDate" : {
          "type" : "date",
          "format" : "strict_date_optional_time||epoch_millis"
        },
        "sourceName" : {
          "type" : "string"
        },
        "title" : {
          "type" : "string"
        },
        "url" : {
          "type" : "string"
        }
      }
    }
  }
}

And the query is as follows:
curl -XPOST 'localhost:9200/search_cluster/_search?pretty&size=50' -d'
{
  "query": {
    "function_score" : {
      "query" : {
        "multi_match" : {
          "query" : "Chicago",
          "fields" : [ "title", "url", "content^3" ],
          "type" : "phrase",
          "slop" : 10,
          "tie_breaker" : 0.3
        }
      },
      "functions" : [ {
        "gauss" : {
          "pubDate" : {
            "scale" : "8h",
            "decay" : 0.7
          }
        }
      } ],
      "score_mode" : "multiply"
    }
  }
}'
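
As a side note on what the gauss function in this query does to scores: the sketch below reproduces the Gaussian decay formula as it is described in the function_score documentation, using the scale (8h) and decay (0.7) from the query above. The function name gauss_decay, the offset default of 0, and the sample ages are illustrative assumptions; it also assumes the origin defaults to the current time for a date field when none is given. It only suggests that the decay shrinks a score without, on its own, dropping a hit.

```python
import math


def gauss_decay(age_hours, scale_hours=8.0, decay=0.7, offset_hours=0.0):
    """Multiplier for a document whose pubDate is `age_hours` away from the origin.

    Hypothetical helper: mirrors the documented gauss formula, not Elasticsearch's code.
    """
    # sigma^2 is chosen so that the multiplier equals `decay` at distance `scale`.
    sigma_sq = -(scale_hours ** 2) / (2.0 * math.log(decay))
    distance = max(0.0, abs(age_hours) - offset_hours)
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


# Illustrative ages only; these are not values from the thread's data set.
for age in (0, 8, 24, 72):
    print(age, gauss_decay(age))
```

The multiplier is 1.0 at the origin and 0.7 at a distance of 8 hours, then falls off quickly. Older documents end up with very small scores, but no min_score is set in the query, so a low score by itself should not remove them from the result page.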


(Mark Harwood) #5

I expect the function_score is the culprit here; if vanilla ranking were broken in the way you described, we'd certainly know about it.
What version of Elasticsearch are you using here?
In the interests of getting down to the smallest reproducible example we can test, do you see this issue if you replace the multi_match with a match_all query?
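
A minimal sketch of the reduced request suggested here, assuming the rest of the original query stays unchanged and only the multi_match is swapped for match_all. The helper name build_minimal_request is made up for illustration; the printed JSON can be passed to the same curl command via -d.

```python
import json


def build_minimal_request():
    """Same function_score wrapper as the original query, with match_all inside."""
    return {
        "query": {
            "function_score": {
                # Only this inner query differs from the original request.
                "query": {"match_all": {}},
                "functions": [
                    {"gauss": {"pubDate": {"scale": "8h", "decay": 0.7}}}
                ],
                "score_mode": "multiply",
            }
        }
    }


body = build_minimal_request()
print(json.dumps(body, indent=2))
```

If the short result page still shows up with this body, the text-matching part of the query can be ruled out.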


#6

Hey,

ES version: 2.3.4
I'm having the same issue with a match_all query as well, getting only 35-36 docs.


(Mark Harwood) #7

That helps narrow things down. Any chance you can supply the date values for the docs in your problem set?


#8

I realised later that there was some kind of filtering happening in the system at a later stage, which I had missed initially. The number of articles returned right after the query was executed was correct.
Sorry for the trouble.
Thank you for all the help!
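
For anyone hitting the same symptom, a rough way to tell whether the shortfall comes from Elasticsearch or from later processing is to count the hits in the raw response before anything else touches it. The sketch below assumes the response has already been parsed into a dict and that paging starts at the first result; the function name and the sample response are hypothetical.

```python
def summarize_search_response(response, requested_size):
    """Compare the number of hits actually returned with the number expected."""
    hits = response["hits"]
    total = hits["total"]
    # Newer Elasticsearch versions report total as {"value": ..., "relation": ...};
    # 2.x, as used in this thread, reports a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    returned = len(hits["hits"])
    expected = min(requested_size, total)
    return {
        "total": total,
        "returned": returned,
        "expected": expected,
        "short": returned < expected,
    }


# Made-up response shaped like a 2.x search result with a full page of 50 hits.
sample = {"hits": {"total": 7746, "hits": [{"_id": str(i)} for i in range(50)]}}
print(summarize_search_response(sample, requested_size=50))
```

If "short" is False on the raw response but fewer items reach the end of the pipeline, the missing results are being dropped after the query, as turned out to be the case here.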


(Mark Harwood) #9

...you had us worried here. We've released the author of function_score from the interrogation room and he has agreed not to press any charges.

All good.


#10

Haha!
Thanks :smile:

