Is it possible to apply a lowercase filter to the _index and _id fields?

I like that the internal _index and _id fields are available to search
on. However, I am moving from a system where we could do
case-insensitive searches on these fields. To get the same behavior, I
am trying to add an analyzer with a lowercase filter to these fields,
but with no luck.

After creating the index, I have tried the following:

curl -XPUT 'http://localhost:9201/twitter/tweet/_mapping' -d '
{
    "tweet" : {
        "_index" : { "enabled" : true, "analyzer" : "lowercase" }
    }
}
'

curl -XPUT 'http://localhost:9201/twitter/tweet/_mapping' -d '
{
    "tweet" : {
        "_index" : { "enabled" : true },
        "properties" : {
            "_index" : { "type" : "string", "analyzer" : "sortable_tokenizer" }
        }
    }
}
'

Neither seems to work.

If analysis is not available on the _id/_index fields, it's not a big
deal; I just need to disable the ES built-in ones and use my own.

Thanks,
Paul

Hi, yea, those fields are not analyzed and you can't control that
(intentionally, since analysis can potentially break a value up into more
than a single term).
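
To illustrate the concern, here is a rough sketch in plain shell. It is not Elasticsearch's actual analysis chain; it only mimics a tokenizer that splits on non-alphanumeric characters and then lowercases. The function name analyze_like and the id value Tweet-ABC_123 are made up for illustration.

```shell
# Rough stand-in for an analyzer: split on non-alphanumerics, then lowercase.
# This is NOT Elasticsearch code; it only shows how one id can turn into
# several terms once a tokenizer gets involved.
analyze_like() {
  printf '%s\n' "$1" | tr -cs '[:alnum:]' '\n' | tr '[:upper:]' '[:lower:]'
}

# Hypothetical id value, for illustration only.
analyze_like 'Tweet-ABC_123'
```

The single id comes out as three separate terms (tweet, abc, 123), so an exact lookup on the original value would no longer correspond to one term in the index.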

Cool, thx.

Will create my own versions.

Why not lowercase beforehand, so you won't store extra data in the index?
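
A minimal sketch of what lowercasing beforehand could look like on the client side, assuming the same node address and the twitter/tweet index and type from the original post. The helper names (to_lower, index_tweet, search_by_id) are hypothetical, and the ids are assumed to be plain alphanumeric so they need no URL or query escaping.

```shell
# Node address, index and type taken from the thread.
ES='http://localhost:9201'

# Lowercase a value with tr (covers ASCII letters in the C/POSIX locale).
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Index a document under a lowercased id (hypothetical helper).
# $1 = id as supplied by the caller, $2 = JSON document body.
index_tweet() {
  id="$(to_lower "$1")"
  curl -XPUT "$ES/twitter/tweet/$id" -d "$2"
}

# Search by id, lowercasing the user-supplied value first (hypothetical helper).
search_by_id() {
  id="$(to_lower "$1")"
  curl -XGET "$ES/twitter/tweet/_search?q=_id:$id"
}

# Usage (requires a running node):
#   index_tweet 'Tweet42' '{"user" : "paul"}'
#   search_by_id 'TWEET42'
```

Because both paths go through to_lower, the stored _id and the searched value agree regardless of the case the caller used, and no extra field is needed.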

I'd like to keep the searches on id and index case-insensitive to
avoid any confusion and to map as closely as possible to the system we
are replacing.

So, the two alternatives are:

  • Lowercase on both the indexing side and the search side
  • Add a lowercase keyword analyzer and have my own versions of these
    fields

With the first option, if I later wanted to do case-sensitive searches
on other fields, I would need to add logic to the search-side
lowercasing so that it only targets specific fields, which gets a
little ugly.

The extra _id field (no way to disable this and _type, right?) adds a
little extra bloat to the index, but I'll take it to get the search
case-insensitive.

Maybe it makes sense to allow a filter to be applied to the internal
queryable fields?

Thanks,
Paul

The _id and _type fields are required. I understand what you are trying
to do; in this case, I suggest you go with adding two fields that hold the
lowercased values.
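
A hedged sketch of how the two extra fields might be set up, following the "lowercase keyword analyzer" idea mentioned earlier in the thread. The analyzer name lowercase_keyword and the field names id_lc and index_lc are assumptions, not names from Elasticsearch or from the thread, and the exact shape of the settings request may differ between Elasticsearch versions.

```shell
ES='http://localhost:9201'   # node address used in the thread

# Index settings: a custom analyzer built from the keyword tokenizer (the
# whole value stays a single term) plus the lowercase token filter.
# The analyzer name is hypothetical.
SETTINGS='{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "lowercase_keyword" : {
          "type" : "custom",
          "tokenizer" : "keyword",
          "filter" : ["lowercase"]
        }
      }
    }
  }
}'

# Mapping: two ordinary fields that carry copies of the id and index name.
# The field names id_lc and index_lc are hypothetical.
MAPPING='{
  "tweet" : {
    "properties" : {
      "id_lc" : { "type" : "string", "analyzer" : "lowercase_keyword" },
      "index_lc" : { "type" : "string", "analyzer" : "lowercase_keyword" }
    }
  }
}'

# Send both requests; call this only against a running node.
create_index_with_lowercase_fields() {
  curl -XPUT "$ES/twitter" -d "$SETTINGS"
  curl -XPUT "$ES/twitter/tweet/_mapping" -d "$MAPPING"
}
```

Each indexed document would then need to include its own id and index name in id_lc and index_lc, and case-insensitive searches would target those fields instead of the built-in _id and _index.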
