Unable to configure Porter stem analyser

I'm looking at using Elasticsearch to provide the search functions of
our site.

I've been experimenting with it but am unable to enable the porterstem
analyser (so that a search for fight matches fights and fighting).

Here's a rundown of my input.

curl -XPUT localhost:9200/local/ -d'
index :
    analysis :
        analyzer :
            stemming :
                type : custom
                tokenizer : standard
                filter : [standard, lowercase, stop, porterStem]
'

curl -XPUT localhost:9200/local/_mapping -d'{"properties": { "title" : { "analyzer" : "stemming", "type" : "string" }}}'

curl -XPUT localhost:9200/local/article/1 -d'{"title": "Fight for your life"}'
curl -XPUT localhost:9200/local/article/2 -d'{"title": "Fighting for your life"}'
curl -XPUT localhost:9200/local/article/3 -d'{"title": "My dad fought a dog"}'
curl -XPUT localhost:9200/local/article/4 -d'{"title": "Bruno fights Tyson tomorrow"}'

However, running a search for 'fight' only matches the first entry,
the one that contains the exact term.

curl -XGET localhost:9200/local/_search?q=fight

The correct settings appear to have been applied, but stemming doesn't
seem to work.

  "indices" : {
    "local" : {
      "aliases" : [ ],
      "settings" : {
        "index.analysis.analyzer.stemming.type" : "custom",
        "index.analysis.analyzer.stemming.tokenizer" : "standard",
        "index.analysis.analyzer.stemming.filter.1" : "lowercase",
        "index.analysis.analyzer.stemming.filter.0" : "standard",
        "index.analysis.analyzer.stemming.filter.3" : "porterStem",
        "index.analysis.analyzer.stemming.filter.2" : "stop",
        "index.number_of_shards" : "5",
        "index.number_of_replicas" : "1"
      },

Anyone got this functionality up and running and able to point me in
the right direction?

Here is a gist with a sample of how it works: gist:879883 · GitHub.

A few notes on what you did:

  1. The put mapping should be on both index and type: PUT localhost:9200/local/article/_mapping.
  2. The body of the put mapping should include the type as the top-level key.
  3. By default, when you search without specifying a field, the _all field is searched, and that's not stemmed based on the mappings; only title is stemmed.
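
The three notes can be sketched roughly as below. This is a minimal sketch, not the contents of the gist: it reuses the index ("local"), type ("article"), analyzer ("stemming") and "string" field type from the original post, while the variable names, function names and the RUN switch are made up for illustration. It assumes the index was already created with the analysis settings shown in the question.

```shell
#!/bin/sh
# Sketch only: index, type and analyzer names come from the original post;
# ES_HOST, MAPPING_BODY, MAPPING_URL, SEARCH_URL, RUN and the function
# names are hypothetical helpers for this example.
ES_HOST="${ES_HOST:-localhost:9200}"

# Note 2: the type name ("article") is the top-level key of the mapping body.
MAPPING_BODY='{"article": {"properties": {"title": {"type": "string", "analyzer": "stemming"}}}}'

# Note 1: the mapping is put on index *and* type.
MAPPING_URL="$ES_HOST/local/article/_mapping"

# Note 3: the query names the title field, so the stemmed field is
# searched instead of the default _all field.
SEARCH_URL="$ES_HOST/local/_search?q=title:fight"

put_mapping() {
    curl -XPUT "$MAPPING_URL" -d "$MAPPING_BODY"
}

search_title() {
    curl -XGET "$SEARCH_URL"
}

# Only talk to a live node when explicitly asked to: RUN=1 sh sketch.sh
if [ "${RUN:-0}" = "1" ]; then
    put_mapping
    search_title
fi
```

The mapping would need to be in place before the four articles are indexed. Even with stemming working, "fought" should not be expected to match, since the Porter stemmer does not reduce it to "fight".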

-shay.banon (in reply to WeeJames's post of Monday, March 21, 2011 at 6:27 PM)

Hi,

I had a similar problem with the Snowball analyzer. With Shay's help I
was able to get it running. You can find the example here:

http://groups.google.com/a/elasticsearch.com/group/users/browse_thread/thread/b94f58fef95db90e/
https://gist.github.com/853994

-Torsten (Mar 21, 7:22 pm)

Thanks. I've got it up and running now thanks to the gist. Is there
any way to apply the index-level mapping through a config file (JSON or
YAML)?
