paul1
(paul)
January 22, 2014, 11:17am
1
My analyzer is defined as below:
"autocomplete_index":{
"type":"custom",
"tokenizer":"keyword",
"filter":[
"lowercase",
"syns_filter",
"my_edgeNgram"
]
}
Now when I test this configuration with the analyze API, the word after the
space gets omitted, i.e. "university" produces no tokens:
................../universityindextest2/_analyze?analyzer=autocomplete_index&text=yale%20university&pretty
Output:
{
  "tokens" : [
    { "token" : "ya", "start_offset" : 0, "end_offset" : 15, "type" : "word", "position" : 1 },
    { "token" : "yal", "start_offset" : 0, "end_offset" : 15, "type" : "word", "position" : 2 },
    { "token" : "yale", "start_offset" : 0, "end_offset" : 15, "type" : "word", "position" : 3 },
    { "token" : "yu", "start_offset" : 0, "end_offset" : 15, "type" : "word", "position" : 4 }
  ]
}
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d6bd7caa-b160-42ac-948c-6aab6884a51d%40googlegroups.com .
For more options, visit https://groups.google.com/groups/opt_out .
Binh_Ly
(Binh Ly)
January 22, 2014, 5:30pm
2
Paul, is it possible that your "syns_filter" is affecting your ngram
filter? What happens when you remove the syns_filter?
paul1
(paul)
January 24, 2014, 5:03am
3
Binh, when I removed the syns_filter the result was still the same, but when I
changed "tokenizer":"keyword" to "whitespace", "university" was taken
into account. Maybe it's a tokenizer problem: when there is a space, the
keyword tokenizer seems to omit the word after the space.
-paul
Binh_Ly
(Binh Ly)
January 24, 2014, 4:57pm
4
Paul, yes you are correct, I missed that. The keyword tokenizer will take
your entire string and make it into a single token - that's why it is not
ngramming "university".
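To make the difference concrete, here is a rough Python simulation of the two analysis chains. It is not Elasticsearch code, only a sketch of the tokenizer and edge n-gram steps. The definition of my_edgeNgram is not shown in this thread, so min_gram=2 and max_gram=4 are assumptions guessed from the "ya", "yal", "yale" output above, and the syns_filter step is left out entirely. The function names are made up for illustration.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def analyze(text, tokenizer, min_gram=2, max_gram=4):
    """Simulate tokenizer -> lowercase -> edge n-gram (assumed gram sizes)."""
    # Tokenizer step: "keyword" keeps the whole input as one token,
    # "whitespace" splits the input on whitespace.
    if tokenizer == "keyword":
        tokens = [text]
    elif tokenizer == "whitespace":
        tokens = text.split()
    else:
        raise ValueError("unsupported tokenizer: " + tokenizer)

    # Lowercase filter step.
    tokens = [t.lower() for t in tokens]

    # Edge n-gram filter step: prefixes are taken from each token separately.
    result = []
    for t in tokens:
        result.extend(edge_ngrams(t, min_gram, max_gram))
    return result


print(analyze("yale university", "keyword"))
# ['ya', 'yal', 'yale']
print(analyze("yale university", "whitespace"))
# ['ya', 'yal', 'yale', 'un', 'uni', 'univ']
```

With the keyword tokenizer the only token is "yale university", so the prefixes never get past "yale" under the assumed max_gram of 4. In this sketch a larger max_gram would give prefixes that run across the space (for example "yale u") rather than separate grams for "university".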