JSON - Analyzer configuration


(Diego Sosa) #1

Hello everyone,

I am trying to create the mapping for my first object. It has two
properties for now: an id and a name. The id is not searchable, but the
name is, and I would like to use the snowball analyzer configured for
the Spanish language. How should I set that in JSON?

{
  "possible_client": {
    "properties" : {
      "id" : {
        "type" : "long",
        "store" : "yes",
        "index" : "not_analyzed"
      },
      "name" : {
        "type" : "string",
        "store" : "yes",
        "index" : "analyzed",
        "analyzer" : "spanish" <--- How to set and configure different analyzers here?
      }
    }
  }
}

Thanks in advance.


(Nick Dunn) #2

Agustin,

My understanding is that you can configure custom analysers in one of two
ways: with a configuration file placed in the elasticsearch directory on
your server (which it reads when you restart the instance), or by passing
your configuration via the API when you create the index. I use the
second method. Here's an example JSON file I use to configure an index:

https://github.com/nickdunn/elasticsearch/blob/master/templates/index.json

It creates two custom analysers named "symphony_fulltext" and
"symphony_autocomplete", which use an array of filters. Note that the
"symphony_fulltext" analyser uses two custom filters named
"custom_synonyms" and "custom_stop", which are defined underneath.

I sent this JSON document when creating the index for the first time, but
you can use the update settings API to send changes to the index after
creation (just be sure to close the index first and reopen it afterwards,
else it won't work).

So it sounds to me like you want to create your own custom analyser, call
it "spanish" and configure your tokenisers and filters there.
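
For illustration, here is a minimal sketch of what such a definition might
look like in the body sent at index creation. It assumes a custom analyser
named "spanish" built from the standard tokenizer, a lowercase filter and a
snowball token filter set to Spanish. The filter name "spanish_stemmer" is
made up for this example and is not taken from the file linked above.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "spanish": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "spanish_stemmer"]
        }
      },
      "filter": {
        "spanish_stemmer": {
          "type": "snowball",
          "language": "Spanish"
        }
      }
    }
  }
}
```

A field mapping can then refer to the analyser by its name, as in
"analyzer" : "spanish".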

N.


(Agustin Lopez) #3

Nick,

Thank you for your reply. Do you know where I can find some documentation about defining custom analyzers in a configuration file placed on the server? Does this impact every index?

On the other hand, I understand what you did. I was trying to do the same thing, but from inside the mapping. I guess that needs to be done at the index level.

I'll give it a try as you said, but if you know where I can find information about the configuration file that would be ideal.

Thanks.


(Ivan Brusic) #4

Agustin,

If you want to define a custom analyzer at the global ES level, you
would need to simply add it to your config file. The default config is
in YAML (elasticsearch.yml), but it can be converted to JSON.

The documentation has an example with custom analyzers/tokenizers/filters:

http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer.html

The configuration shown there needs to be part of elasticsearch.yml.
Every index in the system can use whatever is defined in the
configuration file.
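
As a rough sketch of the JSON form mentioned above, a node-level
definition of a Spanish snowball analyzer might look like the following.
This assumes the analysis settings sit under an "index" key in the node
configuration, mirroring the nesting used in the YAML file. The analyzer
name "spanish" is taken from the original question; the exact file name
and location for a JSON config are not covered here and should be checked
against the documentation for your version.

```json
{
  "index": {
    "analysis": {
      "analyzer": {
        "spanish": {
          "type": "snowball",
          "language": "Spanish"
        }
      }
    }
  }
}
```

Since the node reads its configuration at startup, a change like this
would only take effect after a restart.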

--
Ivan


(Shay Banon) #5

Note that you can also define the analyzers in the create index request.
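
As a sketch of that approach applied to the original question, a create
index request body could carry both the analyzer definition and the
mapping. This assumes the built-in snowball analyzer type with its
"language" option, registered under the name "spanish" so that the "name"
field can reference it. The type and field names come from the first
post; the index name is up to you.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "spanish": {
          "type": "snowball",
          "language": "Spanish"
        }
      }
    }
  },
  "mappings": {
    "possible_client": {
      "properties": {
        "id": {
          "type": "long",
          "store": "yes",
          "index": "not_analyzed"
        },
        "name": {
          "type": "string",
          "store": "yes",
          "index": "analyzed",
          "analyzer": "spanish"
        }
      }
    }
  }
}
```

Defined this way, the analyzer is scoped to the one index rather than to
every index on the node.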

