Add extra stopwords


(Khoa Nguyen) #1

Environment & setup

Mac OSX
ES 0.90.7 installed via homebrew

Steps

update config

/usr/local/Cellar/elasticsearch/0.90.7/config/elasticsearch.yml

add more Stopwords to default standard analyzer

index:
  analysis:
    analyzer:
      standard:
        type: standard
        stopwords: [http, t.co]

restart ES

curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty' -d 'this is a test http'

=> {
  "tokens" : [ {
    "token" : "test",
    "start_offset" : 10,
    "end_offset" : 14,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "http",
    "start_offset" : 15,
    "end_offset" : 19,
    "type" : "<ALPHANUM>",
    "position" : 5
  } ]
}

Expectation

"http" should not be indexed, since it is listed as a stopword.

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/defed7ba-bcb9-4112-9e85-4957d0a9e34c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Khoa Nguyen) #2

The whole purpose is to keep these stopwords from appearing in term facets.



(Alexander Reelsen) #3

Hey,

Try changing

index:
  analysis:
    analyzer:
      standard:

to

index:
  analysis:
    analyzer:
      default:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-analyzers.html#default-analyzers
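For reference, a minimal sketch of how the whole block might look in elasticsearch.yml after that change. It reuses the stopword list from the original post. It rests on the assumption that a custom stopwords list replaces the analyzer's built-in list rather than extending it, so it is worth re-running the _analyze check after restarting. Documents that were already indexed keep their old terms, so they would need to be reindexed before term facets reflect the change.

```yaml
# Sketch only: override the default analyzer instead of defining one named "standard".
index:
  analysis:
    analyzer:
      default:
        type: standard
        # Assumption: this list likely replaces the built-in stopwords,
        # so any built-in words still wanted would have to be listed here too.
        stopwords: [http, t.co]
```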

--Alex


