Scoring of queries on nested documents

Hello,
I am having a problem understanding how scoring of nested documents works.
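I have found other people with similar questions that have remained
unanswered:

http://stackoverflow.com/questions/25619632/elasticsearch-how-is-the-score-for-nested-queries-computed

http://stackoverflow.com/questions/26263562/elasticsearch-boost-score-with-nested-query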

The relevant section of my current mapping (with nested parts) is:
{
    "mappings": {
        "person": {
            "properties": {
                "city": {
                    "type": "nested",
                    "properties": {
                        "visityear": {
                            "type": "integer"
                        },
                        "name": {
                            "type": "string"
                        }
                    }
                }
            }
        }
    }
}

If I have three people who have visited different numbers of cities and I
search for a city that all of them have visited, I get different score
values. The person who visited the greatest number of cities is ranked
first, while the person who visited only one city gets a score of 1
(currently ranked lowest). The explain output says that the score
is based on 'child doc range from 0 to x'. My question is: how do TF, IDF
and Field Norm work for nested documents when the score is being
calculated?

Many thanks,
Barry

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b6dd8305-43df-4146-89f2-28fea0264f61%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Edit: There is only one shard being used in this mapping.


The "score_mode" setting determines how the scores of the matching child
docs are attributed to the parent doc, which is the final scored element.
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html#query-dsl-nested-query

You can, for example, choose to take the average, max or sum of the scores
of all the child documents that match your nested query and reward the
parent doc with that value.
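
To make that concrete, here is a rough Python sketch (not Elasticsearch code). The first function builds a nested query body for the "city" path from the mapping in the original post; the second is a toy model of how the scores of the matching child docs could be folded into one parent score. The function names and the example scores are made up for illustration, and the exact spelling of the sum option depends on the Elasticsearch version, so check the documentation linked above before using it in a real request.

```python
from statistics import mean


def nested_city_query(city_name, score_mode="avg"):
    """Build a search body that matches people by a nested city name.

    "city" and "city.name" come from the mapping in the original post;
    the helper itself is hypothetical.
    """
    return {
        "query": {
            "nested": {
                "path": "city",
                "score_mode": score_mode,
                "query": {"match": {"city.name": city_name}},
            }
        }
    }


def combine_child_scores(child_scores, score_mode="avg"):
    """Toy model: reduce the scores of matching child docs to one parent score."""
    if not child_scores:
        raise ValueError("no matching child docs to combine")
    if score_mode == "avg":
        return mean(child_scores)
    if score_mode == "max":
        return max(child_scores)
    if score_mode == "sum":
        return sum(child_scores)
    raise ValueError("unsupported score_mode: %r" % score_mode)


if __name__ == "__main__":
    # Made-up child scores for one parent with two matching nested docs.
    scores = [1.0, 3.0]
    for mode in ("avg", "max", "sum"):
        print(mode, combine_child_scores(scores, mode))
```

Running the same search with each score_mode and "explain" enabled should show which of these reductions is responsible for the ranking you are seeing.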


Thanks for the help, Mark.
When calculating relevance, can I assume that TF is the number of times that
the term appears in the collapsed nested field, i.e. that all of the city
names get merged into one field? Or is it handled a different way? Is the
Field Length Norm calculated in the same way?

Barry


After some investigation, it turns out that each nested doc is counted
individually as its own document, along with the root doc.
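
For anyone who finds this later, here is a small Python sketch of what that finding implies. It assumes the scenario from my first post (three people with 1, 2 and 3 visited cities, all sharing one city) and assumes Lucene's classic TF/IDF similarity is in use: tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), field norm = 1 / sqrt(number of terms in the field). The function names are my own, the norm shown ignores the lossy single-byte encoding Lucene applies when storing it, and the real document count may also include deleted docs.

```python
import math


def lucene_doc_count(cities_per_person):
    """Count documents when every nested city is indexed as its own doc."""
    root_docs = len(cities_per_person)
    nested_docs = sum(cities_per_person)
    return root_docs + nested_docs


def classic_tf(freq):
    """Term frequency factor: computed per nested doc, not per parent."""
    return math.sqrt(freq)


def classic_idf(doc_freq, num_docs):
    """Inverse document frequency over all docs, nested ones included."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def classic_field_norm(num_terms):
    """Length norm of a single field in a single (nested) doc."""
    return 1.0 / math.sqrt(num_terms)


if __name__ == "__main__":
    # Hypothetical data: three people who visited 1, 2 and 3 cities.
    cities_per_person = [1, 2, 3]
    num_docs = lucene_doc_count(cities_per_person)
    # The shared city appears once in one nested doc per person.
    print("docs:", num_docs)
    print("tf:", classic_tf(1))
    print("idf:", classic_idf(3, num_docs))
    print("norm:", classic_field_norm(1))
```

Under these assumptions the city names are not merged into one field: TF and the field norm are taken from each nested city doc on its own, and only the document counts behind IDF are shared across root and nested docs.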
