Hi!
I tried to define a stopword list for my custom analyzer like this:
"analysis" : {
    "tokenizer" : {
        "host_tokenizer" : {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
        }
    },
    "analyzer" : {
        "host_analyzer" : {
            "type" : "custom",
            "tokenizer" : "host_tokenizer",
            "filter" : ["lowercase"],
            "stopwords": ["www", "fr", "com"]
        }
    }
}
But when I do this, the "stopwords" line is ignored.
Apparently you need to define a separate "stop" token filter and reference it from the analyzer's "filter" list instead:
"analysis" : {
    "filter": {
        "hostname_stop": {
            "type": "stop",
            "stopwords": ["www", "fr", "com"]
        }
    },
    "tokenizer" : {
        "host_tokenizer" : {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
        }
    },
    "analyzer" : {
        "host_analyzer" : {
            "type" : "custom",
            "tokenizer" : "host_tokenizer",
            "filter" : ["lowercase", "hostname_stop"]
        }
    }
}
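To make the intended behaviour concrete, here is a rough Python sketch of what I expect the working analyzer chain to do: pattern tokenizer (group 0, i.e. each full match becomes a token), then lowercase, then the stop filter. This only imitates the chain with Python's re module and is not Elasticsearch itself; the function name and sample hostnames are made up for illustration.

```python
import re

# Same pattern as "host_tokenizer"; with "group": 0 each full match is a token.
TOKEN_PATTERN = re.compile(r"[a-zA-Z0-9]+")

# Same list as the "hostname_stop" filter.
STOPWORDS = {"www", "fr", "com"}


def host_analyzer(text, stopwords=STOPWORDS):
    """Imitate tokenizer -> lowercase -> stop, in that order (hypothetical helper)."""
    tokens = TOKEN_PATTERN.findall(text)              # pattern tokenizer
    tokens = [t.lower() for t in tokens]              # "lowercase" filter
    return [t for t in tokens if t not in stopwords]  # "hostname_stop" filter


print(host_analyzer("www.Example.com"))      # ['example']
print(host_analyzer("WWW.blog.example.FR"))  # ['blog', 'example']
```

Because the lowercase step runs before the stop step, "WWW" and "FR" are removed as well. With an empty stopword set the sketch returns every token, which matches what I observed with the first configuration.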
This is confusing because the first syntax is closer to what you find in
the guide for standard analyzers:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/using-stopwords.html
And there is no error or warning telling me that my "stopwords": ["www", "fr", "com"] line got ignored.
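In the meantime, a client-side check could catch this before the settings are sent. The sketch below flags keys on a "custom" analyzer that are not in an allow-list. The allow-list is my assumption about which keys a custom analyzer accepts and may need adjusting for your Elasticsearch version; the function name is hypothetical.

```python
# Assumption: keys a "custom" analyzer accepts; adjust for your version.
ALLOWED_CUSTOM_KEYS = {
    "type", "tokenizer", "filter", "char_filter", "position_increment_gap",
}


def unexpected_analyzer_keys(analysis):
    """Return {analyzer_name: [unexpected keys]} for custom analyzers (hypothetical helper)."""
    problems = {}
    for name, definition in analysis.get("analyzer", {}).items():
        if definition.get("type") != "custom":
            continue  # other analyzer types take different options
        extra = sorted(set(definition) - ALLOWED_CUSTOM_KEYS)
        if extra:
            problems[name] = extra
    return problems


# The analyzer part of my first (ignored) configuration.
first_attempt = {
    "analyzer": {
        "host_analyzer": {
            "type": "custom",
            "tokenizer": "host_tokenizer",
            "filter": ["lowercase"],
            "stopwords": ["www", "fr", "com"],
        }
    }
}

print(unexpected_analyzer_keys(first_attempt))  # {'host_analyzer': ['stopwords']}
```

Running the same check on the second configuration returns an empty dict, since "stopwords" lives under the "hostname_stop" filter there.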
Gist :
Alix Martin