What is required for partial match to work?

Raul_Jr_Martinez · July 4, 2011, 7:54am

Hello,

I'm fairly new to Elasticsearch. My question is about partial
matching. It is related to an older thread, but that thread didn't
contain the answers I was looking for:
http://groups.google.com/a/elasticsearch.com/group/users/browse_thread/thread/57f551b0897bf55c/be5276d04f0ba1f5?lnk=gst&q=Partial+Search#be5276d04f0ba1f5

I have one document whose title contains the word "ULTRALIGHT". When
I search for "ULTRA" or "LIGHT", that document should be included in
the result set.

I am using query_string when searching. How do I make sure that I get
this result? Should I be using a fuzzy query or FLT (fuzzy_like_this)?

Thanks,

Raul

Paul_Loy · July 4, 2011, 10:02am

Depends on what your data is going to be like. Are these real words, or
usernames?

You can use n-grams, but be careful: depending on your values for n, you can
get lots of matches that seem unrelated. I use n-grams for usernames.
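
To make the trade-off concrete, here is a minimal Python sketch of what an n-gram filter produces for the title word from the question. It only imitates the idea; it is not Elasticsearch code, and the gram sizes (3 to 5) and the helper name `ngrams` are assumptions chosen for illustration.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    token = token.lower()
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


# Grams that would be indexed for the title word (sizes 3..5 are an assumption).
title_grams = ngrams("ULTRALIGHT", 3, 5)

# Both partial searches from the question hit an indexed gram.
print("ultra" in title_grams)
print("light" in title_grams)

# So does a short, unrelated query term such as the name "ali",
# which is the kind of surprising match to watch out for.
print("ali" in title_grams)
```

A larger minimum gram size cuts down on such accidental hits, at the cost of no longer matching very short queries.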

--

Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy

Raul_Jr_Martinez · July 5, 2011, 3:32pm

Hi Paul,

The data is in the form of articles and free-form text like classified ads.

I will try n-grams and see if they work for me. By the way, I tried FLT,
but I need to investigate why some "unrelated" documents are
matching... maybe it's too fuzzy.

Regards,
Raul

Paul_Loy · July 6, 2011, 9:31am

N-gram will probably be worse for matching unrelated docs...

Clinton_Gormley · July 6, 2011, 9:51am

On Wed, 2011-07-06 at 10:31 +0100, Paul Loy wrote:

N-gram will probably be worse for matching unrelated docs...

Ngrams should usually only be used in the index_analyzer, not the
search_analyzer, to improve result relevancy.

For instance:

We index the text "apple" with an edge ngram analyzer, and get:

a,ap,app,appl,apple

When we analyze the search text, we can do it with ngrams or without
ngrams. This is what you would see:

The user wants to search for "application", and starts typing:

           no-ngram         ngram
--------------------------------------------
a          match            match
ap         match            match
app        match            match
appl       match            match
appli      no match         match*
applic     no match         match*

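Those last two results (marked *) show why you usually don't want to use
the ngram version at search time: the search text "appli" is itself split
into a, ap, app, appl, appli, and the shorter grams still match the
indexed "apple", even though the user is clearly typing something else.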
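
The table can be reproduced with a small Python simulation. This is only a sketch of the matching logic, not Elasticsearch itself; the function names and the gram limits are assumptions made for the illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the prefixes of token with lengths from min_gram up to max_gram."""
    token = token.lower()
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


# Index time: "apple" is stored as its edge n-grams.
indexed_terms = set(edge_ngrams("apple"))  # {'a', 'ap', 'app', 'appl', 'apple'}


def matches_without_ngram(typed):
    """Search text is used as a single term (plain search analyzer)."""
    return typed.lower() in indexed_terms


def matches_with_ngram(typed):
    """Search text is also split into edge n-grams; any shared gram matches."""
    return any(gram in indexed_terms for gram in edge_ngrams(typed))


for typed in ["a", "ap", "app", "appl", "appli", "applic"]:
    print(typed, matches_without_ngram(typed), matches_with_ngram(typed))
```

With the plain search side, the document stops matching as soon as the typed text diverges from "apple"; with n-grams on both sides it keeps matching. That is why the n-gram analyzer belongs on the index side only.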

To achieve this, when you specify the mapping for a field, you can do:

{
  "my_content": {
    "type": "string",
    "index_analyzer": "my_ngram_analyzer",
    "search_analyzer": "default"
  }
}
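
The mapping refers to an analyzer named "my_ngram_analyzer", which has to be defined in the index settings. The thread does not show that definition, so the following is only a sketch of what it might look like, built as a Python dict and printed as JSON. The filter name "my_ngram_filter" and the gram sizes are hypothetical. The "nGram" filter type is the spelling used by Elasticsearch releases of that period (an "edgeNGram" type gives prefix-only grams as in the "apple" example); newer releases name things differently, so check the documentation for your version.

```python
import json

# Assumed analysis settings; names and sizes are illustrative, not from the thread.
settings = {
    "analysis": {
        "filter": {
            "my_ngram_filter": {      # hypothetical filter name
                "type": "nGram",      # infix grams, so "light" matches "ultralight"
                "min_gram": 3,        # assumed value
                "max_gram": 5,        # assumed value
            }
        },
        "analyzer": {
            "my_ngram_analyzer": {    # the name the mapping refers to
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "my_ngram_filter"],
            }
        },
    }
}

# Print the body that would be supplied as index settings.
print(json.dumps(settings, indent=2))
```

Note that with these assumed sizes only grams of 3 to 5 characters are indexed, so a search term longer than max_gram would not match through this field; pick the sizes to fit the queries you expect.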

clint

Raul_Jr_Martinez · July 8, 2011, 12:32am

This works for me.

https://gist.github.com/juneym/1070860

elasticsearch.yml:

index:
    refresh_interval: 2
    analysis:
          analyzer:
                ascAnalyzer1:
                  type: custom
                  tokenizer: "standard"

(The file preview is truncated here; the full file is in the gist linked above.)

{
  "content": {
    "type": "string",
    "index_analyzer": "ascAnalyzer1",
    "search_analyzer": "default"
  }
}
