Boosting score relative to location

Hey all,

I have a use case where I am indexing names of people (for a phonebook
kind of app) and autocompleting on both name and number. I am using
edge n-grams and the matching works fine; the number autocomplete is also
working fine.

The result order for the name autocomplete however needs tweaking a bit.

Say we have a name like:

Barack Hussein Obama

I want a person typing "Barack O" to see that as the first result. What is
happening now is that some of the top results could be

  • Obama Barack Etc..

  • SomeOtherFirstName Barack Obama SomeOther last name

I want to give a bigger boost to results where the entered tokens match the
very first and the very last position, i.e. first and last names matching
should have priority. Is this possible?

Thanks in advance.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Guys, any ideas?

I think I would tackle this at the indexing level. Include another field
(or multi-field) which extracts just the first and last word. You could do
this with the Pattern Tokenizer, and use a regex like:

^(\w+?)\b.*\b(\w+?)$

That will extract the first and last word and ignore everything in between.
Now you have a field that contains just two tokens, and you can combine
your previous query with another query that searches this new field (using
a Boolean). If you boost the new field you'll naturally sort "first/last"
matches to the top.
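
To see what that regex captures, here is a small check in Python. It is only a sketch: it uses Python's `re` module rather than the Java regex engine behind Elasticsearch, and the helper name `first_and_last` is made up for illustration. It shows what the two capture groups contain; it does not verify how the Pattern Tokenizer's `group` setting would turn them into tokens.

```python
import re

# The regex from the reply above: capture the first and the last word.
FIRST_LAST = re.compile(r"^(\w+?)\b.*\b(\w+?)$")


def first_and_last(name):
    """Return (first, last) for a multi-word name, or None if the regex does not match."""
    match = FIRST_LAST.match(name.strip())
    return match.groups() if match else None


print(first_and_last("Barack Hussein Obama"))  # ('Barack', 'Obama')
print(first_and_last("Obama Barack"))          # ('Obama', 'Barack')
print(first_and_last("Madonna"))               # None: one word cannot fill both groups
```

Note that a single-word name does not match at all, so such documents would have nothing in the extra field and would rely on the main name field only.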

I think it would be nice if you could extract the first and last name and
then concatenate them into a single string that is stored in one field
(firstNameLastName).

Then at query time you query against that field alongside other fields
(like fullName).

Assign a higher importance to the firstNameLastName field and a lower
importance to the fullName field.

This can be done with the boolean query

http://www.elasticsearch.org/guide/reference/query-dsl/bool-query/
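
As a rough sketch of what such a query body could look like, the Python snippet below builds it as a plain dict and prints the JSON. The field names `fullName` and `firstNameLastName` are the ones used above; the function name `build_name_query` and the boost value of 3.0 are assumptions for illustration, and the right boost would need tuning against real data. It also assumes both fields are analyzed with edge n-grams at index time, so that a partial token like "O" can match.

```python
import json


def build_name_query(text, boost=3.0):
    """Build a bool query: match on fullName, extra score for firstNameLastName.

    The boost value is an arbitrary placeholder, not a recommended setting.
    """
    return {
        "query": {
            "bool": {
                # Every hit must match the full name field.
                "must": [
                    {"match": {"fullName": {"query": text}}}
                ],
                # Hits that also match the first/last field score higher.
                "should": [
                    {"match": {"firstNameLastName": {"query": text, "boost": boost}}}
                ],
            }
        }
    }


print(json.dumps(build_name_query("Barack O"), indent=2))
```

Putting the boosted clause under `should` means it never filters anything out; it only raises the score of documents whose first and last names match.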

You might need a custom analyzer (Token Filter) that automates the
extraction of the first and last names and then combines them into one
string within ES.

So the following original tokens will change as follows after being
processed by the Token Filter

George Walker Bush => George Bush
George Herbert Walker Bush => George Bush
Barack Hussein Obama => Barack Obama

This assumes there are no titles, salutations, or suffixes in the name.

You can just split the original string into an array using the space
character as a delimiter and then concatenate the first and last elements
of the array into a string.
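
A minimal sketch of that split-and-concatenate step, written in Python on the application side rather than as an actual token filter (the function name `first_last_name` is hypothetical). It splits on any run of whitespace, which is slightly more lenient than splitting on a single space character.

```python
def first_last_name(full_name):
    """Return 'First Last' built from the first and last whitespace-separated words."""
    parts = full_name.split()
    if not parts:
        return ""          # empty or whitespace-only input
    if len(parts) == 1:
        return parts[0]    # single-word name: nothing to concatenate
    return f"{parts[0]} {parts[-1]}"


for name in ["George Walker Bush", "George Herbert Walker Bush", "Barack Hussein Obama"]:
    print(f"{name} => {first_last_name(name)}")
```

The result could be sent along with each document as the value of the firstNameLastName field at index time, which avoids writing a custom token filter at the cost of doing the extraction in the indexing client.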

Check out how the ReverseTokenFilter is implemented for some ideas:

http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer/
https://raw.github.com/elasticsearch/elasticsearch/c93babed42f1020f7e348808bec4182fc109c030/src/main/java/org/elasticsearch/index/analysis/ReverseTokenFilterFactory.java
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_4_0/lucene/analysis/common/src/java/org/apache/lucene/analysis/reverse/ReverseStringFilter.java

Author and Instructor for the Upcoming Book and Lecture Series
Massive Log Data Aggregation, Processing, Searching and Visualization with
Open Source Software

http://massivelogdata.com

Perfect! thanks!

On Fri, Aug 30, 2013 at 4:38 PM, Zachary Tong zacharyjtong@gmail.com wrote:

I think I would tackle this at the indexing level. Include another field
(or multi-field) which extracts just the first and last word. You could do
this with the Pattern Tokenizer, and use a regex like:

^(\w+?)\b.*\b(\w+?)$

That will extract the first and last word and ignore everything in
between. Now you have a field that contains just two tokens, and you can
combine your previous query with another query that searches this new field
(using a Boolean). If you boost the new field you'll naturally sort
"first/last" matches to the top.

Many thanks, this is really helpful.

On Fri, Aug 30, 2013 at 5:00 PM, Israel Ekpo israel@aicer.org wrote:

I think it would be nice if you could extract the first and last name and
then concatenate them into a single string that is stored in one field
(firstNameLastName).
