Help - custom analyzer almost works but not getting the results I want when searching _all


(NLS) #1

I got Elasticsearch up and running and indexing, but I do not want it to
tokenize on the dash. So I searched around and wrote a custom analyzer.
This works, but I want it to be the default for all indices and types and
can't seem to get that going.

Anyway, I created a new index with command:

PUT http://myhost:9200/new_index
{
  "settings":{
    "index":{
      "analysis":{
        "analyzer":{
          "s2i_analyzer":{
            "type":"custom",
            "tokenizer":"keyword",
            "filter":[
              "word_filter",
              "lowercase"
            ]
          }
        },
        "filter":{
          "word_filter":{
            "type":"word_delimiter",
            "generate_word_parts":"false",
            "generate_number_parts":"false",
            "split_on_numerics":"false",
            "split_on_case_change":"false",
            "preserve_original":"true"
          }
        }
      }
    }
  }
}
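
As a rough illustration of what this analyzer chain is meant to produce, here is a small Python sketch. It is not Elasticsearch code: the function names are made up for the example, and the word_filter step is a simplified stand-in that assumes the settings above (every generate/split option off, preserve_original on) leave the token unchanged.

```python
def keyword_tokenizer(text):
    # The keyword tokenizer emits the whole input as one token.
    return [text]


def word_filter(tokens):
    # Simplified stand-in (assumption): with all generate_*/split_* options
    # disabled and preserve_original enabled, only the original token remains.
    return list(tokens)


def lowercase(tokens):
    # The lowercase filter lowercases every token.
    return [token.lower() for token in tokens]


def s2i_analyze(text):
    # Same order as the "s2i_analyzer" definition: tokenizer, then filters.
    return lowercase(word_filter(keyword_tokenizer(text)))


print(s2i_analyze("Owner_at_1-1tenth_Boxes"))
# ['owner_at_1-1tenth_boxes']
```

Under these assumptions the dash stays inside a single lowercased token, which is the behaviour the analyzer is intended to give.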

That works. I can then set that analyzer on a type's fields with the
command:
PUT http://myhost:9200/new_index/new_type/_mapping
{
  "new_type":{
    "properties":{
      "alias":{
        "type":"string",
        "index_analyzer":"s2i_analyzer",
        "search_analyzer":"s2i_analyzer"
      },
      "modelName":{
        "type":"string",
        "index_analyzer":"s2i_analyzer",
        "search_analyzer":"s2i_analyzer"
      },
      "_all":{
        "type":"string",
        "index_analyzer":"s2i_analyzer",
        "search_analyzer":"s2i_analyzer"
      }
    }
  }
}
That works successfully.
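
One thing that may be worth checking: Elasticsearch releases of that period document _all as a type-level setting that sits beside "properties" rather than as a field inside it. The Python sketch below only builds such a mapping body for comparison. Whether this placement explains the empty _all results is an assumption and is not confirmed anywhere in this thread; the helper name analyzed_string is made up for the example.

```python
import json

ANALYZER = "s2i_analyzer"  # analyzer name taken from the index settings above


def analyzed_string():
    # Field definition used for alias and modelName in the original mapping.
    return {
        "type": "string",
        "index_analyzer": ANALYZER,
        "search_analyzer": ANALYZER,
    }


# Assumption: _all is configured next to "properties", not inside it.
mapping = {
    "new_type": {
        "_all": {
            "index_analyzer": ANALYZER,
            "search_analyzer": ANALYZER,
        },
        "properties": {
            "alias": analyzed_string(),
            "modelName": analyzed_string(),
        },
    }
}

# Body that would be sent to PUT /new_index/new_type/_mapping
print(json.dumps(mapping, indent=2))
```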

I can query using query_string and get the results I expect on the fields
modelName and alias.
HOWEVER, the same query with the field name changed to _all returns NO
RESULTS. I don't understand.
I have a modelName "Owner_at_1-1tenth_Boxes".

With query_string Owner_at_1-1* I get results when using field modelName
but not _all.
With query_string Owner* I get results for both field modelName
and _all
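
The pattern described here is what one would see if _all were still indexed with an analyzer that splits on the dash, while modelName keeps the value as a single token. The Python sketch below simulates that idea only. The dash-splitting function is a hypothetical stand-in, not the real default analyzer, and lowercasing the query prefix is an assumption about how the wildcard term is handled.

```python
import re


def single_token(text):
    # What a keyword tokenizer plus lowercase filter would index.
    return [text.lower()]


def split_on_dash(text):
    # Hypothetical stand-in for an analyzer that breaks the text at "-".
    return [part for part in re.split(r"-", text.lower()) if part]


def prefix_matches(pattern, tokens):
    # A trailing-* query is compared against each indexed term separately.
    prefix = pattern.rstrip("*").lower()
    return any(token.startswith(prefix) for token in tokens)


name = "Owner_at_1-1tenth_Boxes"
for label, tokens in (
    ("single token", single_token(name)),
    ("split on dash", split_on_dash(name)),
):
    for query in ("Owner*", "Owner_at_1-1*"):
        print(label, query, prefix_matches(query, tokens))
```

In this simulation Owner* matches in both cases, while Owner_at_1-1* matches only the single-token case, because no individual term produced by the split starts with "owner_at_1-1".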

So how do I make s2i_analyzer the default analyzer for both index and
search for ALL FIELDS, so the dash is handled the way we want?

I have been searching the forum and the web and found that the default is
set via index.analysis.analyzer.default.type. So I tried many times to
add it to the settings on index creation, and I also tried adding it to
the mapping on the type.

BUT I got errors or it did not work. So can you please tell me:

  1. Does it need to be done on index creation or in the type mapping?
  2. What is the exact JSON that I need to add to my requests?

Thanks in advance.

--


(Igor Motov) #2

If you want to use this analyzer for all fields, you can simply define it
as the default analyzer:

curl -XPUT http://localhost:9200/new_index -d '{
  "settings":{
    "index":{
      "analysis":{
        "analyzer":{
          "default":{
            "type":"custom",
            "tokenizer":"keyword",
            "filter":[
              "word_filter",
              "lowercase"
            ]
          }
        },
        "filter":{
          "word_filter":{
            "type":"word_delimiter",
            "generate_word_parts":"false",
            "generate_number_parts":"false",
            "split_on_numerics":"false",
            "split_on_case_change":"false",
            "preserve_original":"true"
          }
        }
      }
    }
  }
}'

This way, you will not have to apply it explicitly for every single field.
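
For anyone scripting this instead of using curl, the same request body can be built in Python. This is only a sketch: the function names and the localhost URL are assumptions for the example, and create_index is defined but never called here because it needs a running node.

```python
import json
import urllib.request


def default_analyzer_settings():
    # Same body as the curl command above: the custom analyzer is
    # registered under the name "default" in the index settings.
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "default": {
                            "type": "custom",
                            "tokenizer": "keyword",
                            "filter": ["word_filter", "lowercase"],
                        }
                    },
                    "filter": {
                        "word_filter": {
                            "type": "word_delimiter",
                            "generate_word_parts": "false",
                            "generate_number_parts": "false",
                            "split_on_numerics": "false",
                            "split_on_case_change": "false",
                            "preserve_original": "true",
                        }
                    },
                }
            }
        }
    }


def create_index(url="http://localhost:9200/new_index"):
    # Hypothetical helper: sends the settings with an HTTP PUT at index creation.
    data = json.dumps(default_analyzer_settings()).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


body = json.dumps(default_analyzer_settings(), indent=2)
print(body)
```

The settings are supplied when the index is created, which matches the curl command in this reply.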

On Wednesday, October 24, 2012 11:54:55 AM UTC-4, njl wrote:



(NLS) #3

Thanks

From: elasticsearch@googlegroups.com
[mailto:elasticsearch@googlegroups.com] On Behalf Of Igor Motov
Sent: Wednesday, October 24, 2012 1:26 PM
To: elasticsearch@googlegroups.com
Subject: Re: Help - custom analyzer almost works but not getting the
results I want when searching _all



(system) #4