Remove all stopwords


(David Squires) #1

Hello,

We're using the standard analyzer, and it's not working out well for our
purpose. We want to remove all of the stopwords from the analyzer's list: for
our search, 'the', 'and', and 'this' are actually important terms.

Is there any way to do this using a filter? I've tried to re-index using
this:

curl -XPUT "localhost:9200/files/_settings" -d '{
  "index": {
    "analysis": {
      "analyzer": {
        "standard": {
          "stopwords": []
        }
      }
    }
  }
}'

This returned
{"ok":true}

But it doesn't seem to have any effect on our queries using:

{
  "query": {
    "field": {
      "name": "the dog"
    }
  }
}

Any suggestions would be great!

Thank you

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

Use another analyzer instead of the standard one.
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to David Squires's message of 13 Oct 2013, 20:52.)


(David Squires) #3

Hi David,

Thank you for the reply. I've attempted to create a custom analyzer,
but no luck. Here's what I've set:

curl -XPUT "localhost:9200/files/_settings" -d '{
  "index": {
    "analysis": {
      "analyzer": {
        "noStopAnalyzer": {
          "type": "custom",
          "tokenizer": "noStopTokenizer",
          "filter": [noStops]
        }
      },
      "tokenizer": {
        "noStopTokenizer": {
          "type": "standard"
        }
      },
      "filter": {
        "noStops": {
          "type": "stop",
          "stopwords": []
        }
      }
    }
  }
}'

I get an error:

Unrecognized token 'noStops': was expecting \n

Any ideas?

Thanks!

(In reply to David Pilato's message of Sunday, October 13, 2013 5:31:48 PM UTC-4.)


(David Squires) #4

Ah, it looks like the filter name just had to be in quotes.

I've tried with that and it parses, but then I get the error:

ElasticSearchIllegalArgumentException[Can't update non dynamic settings[
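
For anyone following along, the quoting fix mentioned above can be sketched roughly as below. JSON string values must be quoted, so the filter list becomes ["noStops"]. The analyzer, tokenizer, and filter names are the ones from the earlier post; the variable name SETTINGS_JSON is only an illustrative assumption, and the snippet just builds and prints the body rather than sending it.

```shell
# Corrected settings body: the filter name is a JSON string, so it is quoted.
# SETTINGS_JSON is an illustrative variable name, not part of any API.
SETTINGS_JSON='{
  "index": {
    "analysis": {
      "analyzer": {
        "noStopAnalyzer": {
          "type": "custom",
          "tokenizer": "noStopTokenizer",
          "filter": ["noStops"]
        }
      },
      "tokenizer": {
        "noStopTokenizer": { "type": "standard" }
      },
      "filter": {
        "noStops": { "type": "stop", "stopwords": [] }
      }
    }
  }
}'

# Print the body so it can be inspected or piped to a JSON validator.
printf '%s\n' "$SETTINGS_JSON"
```

The body can then be passed to curl with -d "$SETTINGS_JSON" in place of the inline string.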

(In reply to David Squires's message of Sunday, October 13, 2013 6:19:26 PM UTC-4.)


(David Squires) #5

OK, so I found that I have to close the index first:

curl -XPOST "localhost:9200/files/_close"

Then I ran the settings update and re-opened the index using:

curl -XPOST "localhost:9200/files/_open"

Is there anything else I would need to do? Do I just set
{"filter" : "noStops"} in my query?

It doesn't seem to be making a difference.
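
The close, update, re-open sequence described above could be wrapped up roughly as follows, together with a quick check of what an analyzer actually emits. This is a sketch under assumptions: the function names update_analysis and show_tokens and the ES_URL and INDEX variables are made up for illustration, and the _analyze call uses the query-string form that older releases accept (newer releases expect a JSON body instead). Note also that an analyzer is normally attached to a field through the mapping (or as the index default) rather than named as a filter in the query, and documents indexed earlier keep their old tokens until they are reindexed.

```shell
# Assumed defaults; adjust to your cluster. "files" is the index from this thread.
ES_URL="${ES_URL:-localhost:9200}"
INDEX="${INDEX:-files}"

# Analysis settings are not dynamic, so the index is closed around the update.
# $1 is the path to a JSON file holding the analysis settings.
update_analysis() {
  settings_file="$1"
  curl -XPOST "$ES_URL/$INDEX/_close" &&
  # The Content-Type header is required by newer releases and harmless on old ones.
  curl -XPUT "$ES_URL/$INDEX/_settings" \
       -H 'Content-Type: application/json' \
       -d @"$settings_file" &&
  curl -XPOST "$ES_URL/$INDEX/_open"
}

# Ask the index how a given analyzer tokenizes some text.
# $1 is the analyzer name, $2 the text to analyze.
show_tokens() {
  analyzer="$1"
  text="$2"
  curl -G "$ES_URL/$INDEX/_analyze" \
       --data-urlencode "analyzer=$analyzer" \
       --data-urlencode "text=$text"
}
```

For example, show_tokens noStopAnalyzer "the dog" should list both "the" and "dog" if stopwords are really being kept.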

(In reply to David Squires's message of Sunday, October 13, 2013 6:21:39 PM UTC-4.)


(David Pilato) #6

Don't add the stopword filter and you're done.
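
A minimal sketch of what that might look like: a custom analyzer built from the standard tokenizer and a lowercase filter, with no stop filter in the chain, registered as the index's "default" analyzer at creation time. The index name passed to create_index, the CREATE_BODY variable, and the choice of the lowercase filter are assumptions for illustration, not something stated in this thread.

```shell
# Custom analyzer with no stop filter: every token, including "the", "and",
# and "this", is kept. Naming it "default" makes it the analyzer for fields
# that do not specify one explicitly.
CREATE_BODY='{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'

# Create a new index with these settings. $1 is the (hypothetical) index name.
# The Content-Type header is required by newer releases and harmless on old ones.
create_index() {
  curl -XPUT "${ES_URL:-localhost:9200}/$1" \
       -H 'Content-Type: application/json' \
       -d "$CREATE_BODY"
}
```

Since existing documents were analyzed with the old settings, they would need to be indexed again into the new index before searches on "the" can match.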

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to David Squires's message of 14 Oct 2013, 00:32.)

