Find e-mail


(Hugo) #1

Hi all,

Can anyone please help me with this issue? I'm trying to find an e-mail
address with Elasticsearch but can't get it to work properly.

I've created an index with the following mapping:

$ curl -s -XPUT http://localhost:9200/users/info/_mapping -d '
{
  "info" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {
            "type" : "string"
          },
          "untouched" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      },
      "email" : { "type" : "string", "analyzer" : "keyword" }
    }
  }
}'

Add two documents:

$ curl -XPUT http://localhost:9200/users/info/1 -d '{
  "user" : "user1",
  "email" : "user1@email.com",
  "message" : "Testing"
}'

$ curl -XPUT http://localhost:9200/users/info/2 -d '{
  "user" : "user2",
  "email" : "user2@email.com",
  "message" : "Programming"
}'

Note: I set the email analyzer to "keyword" because the
presentation http://www.slideshare.net/clintongormley/terms-of-endearment-the-elasticsearch-query-dsl-explained
shows that this analyzer keeps the email address as a single token.

And then I've created the following query:

$ curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "_all" : "user1@email.com" } },
          { "not" : { "filter" : { "ids" : { "values" : [ "400" ] } } } }
        ]
      }
    }
  }
}'

But it doesn't work. The result is empty.

I've checked that this query works if I search for the value of another
field (the name, for instance):

$ curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "_all" : "user1" } },
          { "not" : { "filter" : { "ids" : { "values" : [ "400" ] } } } }
        ]
      }
    }
  }
}'

It also works if I search only in the "email" field:

$ curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "email" : "user1@email.com" } },
          { "not" : { "filter" : { "ids" : { "values" : [ "400" ] } } } }
        ]
      }
    }
  }
}'

Thanks for your help.
Best regards,
Hugo


(Clinton Gormley) #2

Hiya

I've created an index with the following mapping:

        "email" : {"type" : "string", "analyzer":"keyword"}

And then I've created the following query:

$ curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '

      { "term" : { "_all" : "user1@email.com" } },

You have set the 'email' field to use the keyword analyzer, but then
you are querying the _all field, which uses the standard analyzer by
default.

So your email address is not stored in _all as the single term
"user1@email.com".

Try this clause instead:

      { "term" : { "email" : "user1@email.com" } },

clint
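
One way to see the mismatch described above is to ask Elasticsearch how each analyzer tokenizes the address. This is only a sketch: it assumes the _analyze endpoint accepts an "analyzer" query parameter with the raw text as the request body, as in releases from the time of this thread (newer releases expect a JSON body instead). The host and port are the ones used in the thread.

```shell
# Sketch only: compare how two built-in analyzers tokenize the same address.
# Assumes a node at localhost:9200 and the old-style _analyze API
# (analyzer as a query parameter, raw text as the request body).

# Analyzer used by the _all field by default
curl -s -XGET 'http://localhost:9200/_analyze?analyzer=standard&pretty=1' -d 'user1@email.com'

# Analyzer configured on the "email" field in the mapping
curl -s -XGET 'http://localhost:9200/_analyze?analyzer=keyword&pretty=1' -d 'user1@email.com'
```

A term filter does not analyze its input, so it matches only when the exact string is present as a token in the queried field. If the first call reports more than one token for the address, that accounts for the empty result on _all.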


(Hugo) #3

Hi Clinton,
Thanks for the help.
Is it possible to change the behaviour of _all so it finds the emails
also?

Thanks again.

Best regards,
Hugo


(Clinton Gormley) #4

On Sat, 2012-01-14 at 07:12 -0800, Hugo wrote:

Hi Clinton,
Thanks for the help.
Is it possible to change the behaviour of _all so it finds the emails
also?

Hi Hugo,

You could change the analyzer that _all uses to the uax_url_email
analyzer, which is the same as the standard analyzer except that it
keeps email addresses and URLs as single tokens.

clint


(Hugo) #5

Hi again Clinton,

Thanks for your help.
Can you please tell me how to do that?

I've tried to do:

curl -s -XPUT http://localhost:9200/users/info/_mapping -d '
{
  "info" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {
            "type" : "string"
          },
          "untouched" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      },
      "email" : { "type" : "string", "analyzer" : "keyword" },
      "_all" : { "type" : "string", "analyzer" : "uax_url_email" }
    }
  }
}'

but it doesn't work. It throws the error:
{"error":"MapperParsingException[Analyzer [uax_url_email] not found for field [_all]]","status":400}

Thanks again.
Best regards,
Hugo


(Clinton Gormley) #6

Hi Hugo

        "_all" : {"type" : "string", "analyzer":"uax_url_email"}

but it doesn't work. It throws the error:
{"error":"MapperParsingException[Analyzer [uax_url_email] not found for field [_all]]","status":400}

Sorry - the uax_url_email is a tokenizer, not an analyzer, so you need
to create your own analyzer to use it.

You can only do this at index creation time, so delete your current
index, then try this:

curl -XPUT 'http://127.0.0.1:9200/users/?pretty=1' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "uax_url_email" : {
          "filters" : [
            "standard",
            "lowercase",
            "stop"
          ],
          "tokenizer" : "uax_url_email"
        }
      }
    }
  },
  "mappings" : {
    "info" : {
      "properties" : {
        "email" : {
          "type" : "string",
          "analyzer" : "keyword"
        },
        "_all" : {
          "type" : "string",
          "analyzer" : "uax_url_email"
        },
        "name" : {
          "fields" : {
            "untouched" : {
              "index" : "not_analyzed",
              "type" : "string"
            },
            "name" : {
              "type" : "string"
            }
          },
          "type" : "multi_field"
        }
      }
    }
  }
}
'

clint


(Hugo) #7

Hi Clinton,

Thanks again for your help.
I followed your suggestion but it still does not work. The following
query returns 0 hits:

curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "_all" : "user1@email.com" } },
          { "not" : { "filter" : { "ids" : { "values" : [ "400" ] } } } }
        ]
      }
    }
  }
}'

I noticed that if I change the query to find the string "email.com" it
returns 2 hits:

curl -XGET 'http://localhost:9200/users/info/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "_all" : "email.com" } },
          { "not" : { "filter" : { "ids" : { "values" : [ "400" ] } } } }
        ]
      }
    }
  }
}'

Thanks again.
Best regards,
Hugo


(Clinton Gormley) #8

On Sun, 2012-01-15 at 05:57 -0800, Hugo wrote:

Hi Clinton,

Thanks again for your help.
I followed your suggestion but it still does not work. The following
query returns 0 hits:

Oh I should really test this stuff out before posting :)

OK, I was defining the analyzer for _all in the wrong place, and I was
misspelling 'filter' as 'filters'.

here is a working version: https://gist.github.com/1620128

clint
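
For readers who cannot open the gist, the two fixes named in this post can be sketched as follows. This is a reconstruction from that description, not a copy of the gist: it assumes the _all analyzer belongs directly under the type mapping (beside "properties" rather than inside it), and it renames "filters" to "filter". The explicit "type" : "custom" and the delete call are additions for clarity.

```shell
# Sketch only, reconstructed from the two fixes described in the post;
# the linked gist may differ in detail.
# Delete the old index first, since the analyzer is defined at creation time.
curl -XDELETE 'http://127.0.0.1:9200/users/?pretty=1'

# Recreate the index with a custom analyzer built on the uax_url_email tokenizer.
curl -XPUT 'http://127.0.0.1:9200/users/?pretty=1' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "uax_url_email" : {
          "type" : "custom",
          "tokenizer" : "uax_url_email",
          "filter" : [ "standard", "lowercase", "stop" ]
        }
      }
    }
  },
  "mappings" : {
    "info" : {
      "_all" : { "analyzer" : "uax_url_email" },
      "properties" : {
        "email" : { "type" : "string", "analyzer" : "keyword" },
        "name" : {
          "type" : "multi_field",
          "fields" : {
            "name" : { "type" : "string" },
            "untouched" : { "type" : "string", "index" : "not_analyzed" }
          }
        }
      }
    }
  }
}'
```

After the two documents are indexed again, the _all term filter for "user1@email.com" from post #7 is the query to re-run as a check.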


(Hugo) #9

Hi Clinton,

It's working. Thanks for all your help!!!

Best regards,
Hugo


(mr_max) #10

Hello, can I change the settings without deleting the index?

