Analyzer problem


(elyrank) #1

Hi,

I'm new to Elasticsearch and still trying to understand exactly how the
index and mapping definitions affect search.

I defined an edgeNGram analyzer so I could search for partial words,
but now the search returns too many results.

When I search for "serial" I get results for serialize, serializable,
serialed, etc.,
but I also get "service", which I don't want.

I defined the analyzer as follows:

.startObject("test_filter_ngram")
.field("type", "edgeNGram")
.field("min_gram", 3)
.field("max_gram", 40)
.endObject()

Is there any other way to configure it to avoid false positives and still
be able to search for partial words?

--
Thanks,
Elyran

--
This message may contain confidential and/or privileged information.
If you are not the addressee or authorized to receive this on behalf of the
addressee you must not use, copy, disclose or take action based on this
message or any information herein.
If you have received this message in error, please advise the sender
immediately by reply email and delete this message. Thank you.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Ivan Brusic) #2

Are you using the same analyzer during both indexing and querying? You
should only apply ngrams at index time and search without applying ngrams to
your search terms.
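
To illustrate why this matters, here is a minimal sketch in plain Java. It does not call Elasticsearch at all: the class EdgeNGramDemo and the methods edgeNGrams and matches are hypothetical names made up for this example, and the matching rule (a hit if any query token equals any indexed token) is a simplifying assumption about how an OR-style match behaves. It uses the min_gram of 3 and max_gram of 40 from the definition above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EdgeNGramDemo {

    // Prefixes of `term` with lengths minGram..maxGram (capped at the term length),
    // roughly what an edge n-gram filter would emit for a single token.
    public static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, term.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Simplified matching (an assumption for this sketch): the indexed term is a hit
    // when any query token equals any edge n-gram stored for that term.
    public static boolean matches(List<String> queryTokens, String indexedTerm,
                                  int minGram, int maxGram) {
        Set<String> indexed = new HashSet<>(edgeNGrams(indexedTerm, minGram, maxGram));
        for (String token : queryTokens) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int min = 3;
        int max = 40;
        // Query run through the same edge n-gram analysis: ser, seri, seria, serial
        List<String> ngrammedQuery = edgeNGrams("serial", min, max);
        // Query kept as one whole term
        List<String> plainQuery = List.of("serial");

        for (String doc : List.of("serialize", "serializable", "service")) {
            System.out.println(doc
                    + ": ngrammed query=" + matches(ngrammedQuery, doc, min, max)
                    + ", plain query=" + matches(plainQuery, doc, min, max));
        }
    }
}
```

In this sketch the n-grammed query hits "service" because both share the 3-character prefix "ser", while the plain query "serial" only hits terms that actually start with "serial".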

Cheers,

Ivan

On Wed, Oct 2, 2013 at 4:01 AM, Elyran Kogan elyran@liveperson.com wrote:



(elyrank) #3

That is exactly what I'm doing: I did not specify any analyzer at query
time, only at index time.

On Thu, Oct 3, 2013 at 12:37 AM, Ivan Brusic ivan@brusic.com wrote:


--
Thanks,
Elyran Kogan LivePerson, Inc. Software Developer T +972 74 700 4387 F +972
74 700 4920 13 Zarchin Street PO Box 2067, Industrial Area Ra'anana 43100,
Israel Meaningful connections through intelligent engagement.™


