Score based on term frequency only

kevins · January 5, 2014, 9:57pm

I would like to score based entirely on term count.

For example, given the following two documents:

{ "apple" }
{ "apple apple" }

Searching "apple" ranks the first before the second. I wish to rank the
second, in which the term occurs twice, with a higher score.

Can someone please point me in the right direction for this?

Thank you.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1bb386ae-3ab5-4878-9d29-6462eaff14c7%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ivan · January 6, 2014, 1:13am

You could provide your own Similarity class as a plugin. I don't have any
sample code in front of me, but it would be based on TFIDFSimilarity, and
you would basically need to ignore the norms and other values.

http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html

The IDF portion could probably remain since it ranks the different terms in
your query, not the score of each term.

Cheers,

Ivan
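
To illustrate the idea of ignoring the norms, here is a minimal Python sketch of the arithmetic only, not an Elasticsearch plugin. It assumes the default Lucene 4.x formulas tf = sqrt(freq) and length norm = 1/sqrt(number of terms), and leaves out query norm, coord and boosts. The function names (classic_term_score, count_only_score, rank) are hypothetical. In exact arithmetic the two example documents tie under the default formula; Lucene stores the norm with reduced precision, which is a plausible reason the shorter document ends up first in practice.

```python
import math


def classic_term_score(freq, field_length, idf=1.0):
    """Simplified term score in the style of Lucene 4.x DefaultSimilarity.

    Assumes tf = sqrt(freq) and length norm = 1 / sqrt(field_length);
    query norm, coord and boosts are left out.
    """
    tf = math.sqrt(freq)
    length_norm = 1.0 / math.sqrt(field_length)
    return tf * idf * length_norm


def count_only_score(freq, idf=1.0):
    """Hypothetical replacement: raw term count, norms ignored."""
    return freq * idf


def rank(docs, term, scorer):
    """Return (doc_id, score) pairs sorted by descending score."""
    scored = []
    for doc_id, text in docs.items():
        tokens = text.split()
        freq = tokens.count(term)
        if freq:
            scored.append((doc_id, scorer(freq, len(tokens))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = {"doc1": "apple", "doc2": "apple apple"}

# Default-style scoring: the length norm cancels the extra occurrence.
classic = rank(docs, "apple", classic_term_score)

# Count-only scoring: the document with two occurrences wins.
count_only = rank(docs, "apple", lambda freq, length: count_only_score(freq))

print(classic)
print(count_only)
```

A custom Similarity would apply the same change inside Lucene: return the raw frequency from the tf function and a constant from the length norm.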

Britta_Weber · January 7, 2014, 4:31pm

You could also use a script as described here:

Cheers,
Britta

Ivan · January 7, 2014, 5:26pm

Great feature. However, it looks like it is only available in the master
branch: see "Add support for using payloads to boost terms", Issue #3772 in
elastic/elasticsearch on GitHub.

--
Ivan

David_Zweigenhaft · April 21, 2014, 7:14am

I am new to Elasticsearch and I would like to score based entirely on term
count. I would like to know how you solved it.

Can you share your solution?

Actually, I would like to count how many times a phrase repeats in a
document (for example, the phrase "apple apple"). Do you think it is
possible to use the term frequency for counting phrases?

I'm really stuck with this and need help.

Thank you.

