Stop words not indexed (custom edgeNGram analyzer with no STOP filter)

Robin_Boutros · May 7, 2012, 8:04pm

Hey,

This is how I define my mapping with Tire:

settings :analysis => {
  :filter => {
    :title_ngram => {
      type: "edgeNGram",
      side: "front",
      max_gram: 15,
      min_gram: 1
    }
  },
  tokenizer: {
    title_tokenizer: {
      pattern: "[^\p{L}\d]+",
      type: "pattern"
    }
  },
  :analyzer => {
    :ngram_analyzer => {
      tokenizer: "title_tokenizer",
      filter: ["lowercase", "title_ngram"],
      type: "custom"
    }
  }
} do
  mapping do
    indexes :id, type: 'integer'
    indexes :title, type: 'string', analyzer: 'snowball'
    indexes :description, type: 'string', analyzer: 'snowball'
    indexes :small_photo, index: :not_analyzed
    indexes :ngram_title, :type => 'string', :index_analyzer => "ngram_analyzer",
                          search_analyzer: "standard"
  end
end

I have 2 documents: "The Tree" and "Isnogood".

When I search for "th", "The Tree" is found. With "The", it's not.
When I search for "i", "Isnogood" is found. With "is", it's not.

Why aren't stop words indexed? There is no "stop" filter in the
ngram_analyzer...

Thanks!

kimchy · May 9, 2012, 9:19am

Your search analyzer is the standard analyzer, so it runs on the text you
provide as the query and removes stop words from it. The stop words are
indexed; they are dropped from the query before it is matched.
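
To make the token flow concrete, here is a minimal plain-Ruby sketch of the two analysis paths. It is not Elasticsearch or Tire code: index_tokens, search_tokens, match? and the abridged STOP_WORDS list are hypothetical stand-ins, and the stop list is assumed to resemble the default English one the standard analyzer applied at the time.

```ruby
# Simplified stand-in for the analysis chain in this thread.
# This is NOT Elasticsearch code, only an illustration of the token flow.

# Assumption: roughly the default English stop words of the standard analyzer.
STOP_WORDS = %w[a an and are as at be but by for if in into is it no not of on or
                such that the their then there these they this to was will with].freeze

# Index side (like ngram_analyzer): split on runs of non-letter/non-digit
# characters, lowercase, then emit front edge n-grams of each word.
def index_tokens(text, min_gram: 1, max_gram: 15)
  text.split(/[^\p{L}\d]+/).reject(&:empty?).flat_map do |word|
    word = word.downcase
    upper = [max_gram, word.length].min
    (min_gram..upper).map { |n| word[0, n] }
  end
end

# Search side (like the standard analyzer with stop words): split, lowercase,
# then drop any token found in the stop list. No n-grams are produced here.
def search_tokens(query)
  query.split(/[^\p{L}\d]+/)
       .reject(&:empty?)
       .map(&:downcase)
       .reject { |token| STOP_WORDS.include?(token) }
end

# A title matches when at least one surviving query token is an indexed token.
def match?(title, query)
  (index_tokens(title) & search_tokens(query)).any?
end
```

With these definitions, match?("The Tree", "th") is true while match?("The Tree", "The") is false, even though "the" is among the indexed tokens: the query is empty after stop word removal. If that is the behavior to avoid, one option (an assumption, not something stated in this thread) would be to point search_analyzer at an analyzer without a stop filter, for example a custom one built from title_tokenizer and lowercase only.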
