Searching on email IDs not working in 0.16.0


(senthil prabhu) #1

Hi All,

I am using the mapping below in both version 0.15.2 and version 0.16.0:

"mail_from" : {"type" : "string", "store" : "yes", "format" : "MMM/dd/yyyy HH:mm:ss",
"term_vector" : "with_positions_offsets", "index" : "analyzed", "index_name" : "mail_from",
"omit_term_freq_and_positions" : false, "omit_norms" : false, "precision_step" : 4,
"index_analyzer" : "standard" }

However, searches for email addresses return no documents in version 0.16.0.


(ofavre) #2

Hi,

You should use the ids query.
See http://www.elasticsearch.org/guide/reference/query-dsl/ids-query.html

The _id field is no longer indexed, see the first breaking change listed
here:
http://www.elasticsearch.org/download/2011/04/23/0.16.0.html

Regards,
Olivier @Yakaz



(senthil prabhu) #3

Thank you. My question is about searching on email addresses (for example "senthil@gamil"):
this does not work in version 0.16.0. What do I need to do?



(Clinton Gormley) #4

On Wed, 2011-05-04 at 03:50 -0700, senthil prabhu wrote:

thank you.... My doubt is searching on emails ("senthil@gamil") not
working in version 16.0... What i need to do ...?

The first thing that you need to do is to provide an example of what you
are already doing. See http://www.elasticsearch.org/help

Also, why do you have a date format in a string property?

"mail_from" : {"type" : "string", "store" : "yes", "format" : "MMM/dd/yyyy HH:mm:ss",
"term_vector" : "with_positions_offsets", "index" : "analyzed", "index_name" : "mail_from",
"omit_term_freq_and_positions" : false, "omit_norms" : false, "precision_step" : 4,
"index_analyzer" : "standard" }

Emails are analyzed differently in 0.16, using the standard analyzer.

So if you are doing a term query for the email, you won't find it. In
0.15, the email "senthil@gmail.com" would have resulted in the term
"senthil@gmail.com". In 0.16, it results in the terms
"senthil","gmail","com"
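
To make the point above concrete, here is a minimal Python sketch of why a term lookup misses once the address has been split into several terms. It does not use Elasticsearch or Lucene at all: `simplified_analyze`, `build_index` and `term_lookup` are hypothetical helpers, and the tokenizer is only a rough stand-in that reproduces the terms quoted above, not the real standard analyzer.

```python
import re
from collections import defaultdict


def simplified_analyze(text):
    """Rough stand-in for an analyzer (an assumption, not the real one):
    lowercase the text and split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs, analyze):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def term_lookup(index, term):
    """A term lookup compares the query text to indexed terms verbatim,
    without analyzing the query first."""
    return sorted(index.get(term, set()))


docs = {1: "senthil@gmail.com"}
index = build_index(docs, simplified_analyze)

print(simplified_analyze("senthil@gmail.com"))
print(term_lookup(index, "senthil@gmail.com"))  # whole address is not a term
print(term_lookup(index, "gmail"))              # an individual term is
```

The whole address never appears as a single key in the toy index, so an exact lookup for it comes back empty, while a lookup for one of the pieces finds the document.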

clint


(senthil prabhu) #5

Yes, you are exactly right.

1.) What type of mapping do I need to use to match the full email
address when searching? Or what type of search query should I use
instead of a term query?



(Clinton Gormley) #6

On Wed, 2011-05-04 at 04:18 -0700, senthil prabhu wrote:

Ya you are exactly right...

1.) What type of mapping that i need to use to get the full email
while search?
or what type search query that i need to use instead of term
query ?

You have a few choices.

With your existing mapping, if your mail_from field contains exactly one
email address, then you could use a query string search, and put the
email address into double quotes, eg: '"senthil@gmail.com"'

However that would also find "foobar.senthil@gmail.com"
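
As a rough illustration of why the quoted phrase also matches the longer address, the Python sketch below treats a phrase search as "the query terms appear consecutively in the field's terms". It is a toy model, not Elasticsearch code: `simplified_analyze` and `phrase_matches` are hypothetical names, and the tokenizer is a simplified stand-in that splits on non-alphanumeric characters.

```python
import re


def simplified_analyze(text):
    """Simplified stand-in tokenizer (assumption): lowercase, then split
    on every character that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def phrase_matches(field_value, phrase):
    """Return True if the analyzed phrase occurs as a consecutive run
    inside the analyzed field value."""
    doc_terms = simplified_analyze(field_value)
    phrase_terms = simplified_analyze(phrase)
    n = len(phrase_terms)
    if n == 0:
        return False
    return any(
        doc_terms[i:i + n] == phrase_terms
        for i in range(len(doc_terms) - n + 1)
    )


print(phrase_matches("senthil@gmail.com", "senthil@gmail.com"))
print(phrase_matches("foobar.senthil@gmail.com", "senthil@gmail.com"))
print(phrase_matches("senthil@yahoo.com", "senthil@gmail.com"))
```

Under this model the longer address contains the same run of terms, which is why a phrase search cannot serve as an exact-match check.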

So if you want to use the email address as a unique ID, you probably do
want to do a term query, and you could set the mail_from property to
{"type": "string", "index":"not_analyzed"}

However, the term "senthil@gmail.com" is not the same as the term
"Senthil@GMAIL.com"

So make sure that you lowercase both the data that you index, and the
term that you search for. (and you may need to trim any leading or
trailing whitespace as well)
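
A minimal sketch of that normalization step in Python, assuming it is applied in your own client code both before indexing and before building the term query (`normalize_email` is a hypothetical helper name, not part of any Elasticsearch client):

```python
def normalize_email(raw):
    """Trim leading/trailing whitespace and lowercase the address so that
    the indexed value and the searched term compare equal."""
    return raw.strip().lower()


indexed_value = normalize_email("  Senthil@GMAIL.com \n")
search_term = normalize_email("senthil@gmail.com")

print(indexed_value == search_term)
```

The same function has to run on both sides; normalizing only the indexed data or only the query leaves the two terms different.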

Also, would you want to do searches for any email address containing
"senthil" or "gmail"?

If so, then you need both the not_analyzed field, and an analyzed field.
You could do this with a multi field:

curl -XPUT 'http://127.0.0.1:9200/foo/email/_mapping?pretty=1' -d '
{
   "email" : {
      "properties" : {
         "mail_from" : {
            "fields" : {
               "id" : {
                  "index" : "not_analyzed",
                  "type" : "string"
               },
               "mail_from" : {
                  "index" : "analyzed",
                  "type" : "string"
               }
            },
            "type" : "multi_field"
         }
      }
   }
}
'

See http://www.elasticsearch.org/guide/reference/mapping/multi-field-type.html

Then you can use these as follows:

Search for "gmail":

curl -XGET 'http://127.0.0.1:9200/foo/email/_search?pretty=1' -d '
{
   "query" : {
      "field" : {
         "mail_from" : "gmail"
      }
   }
}
'

Find exactly "senthil@gmail.com":

curl -XGET 'http://127.0.0.1:9200/foo/email/_search?pretty=1' -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "term" : {
               "mail_from.id" : "senthil@gmail.com"
            }
         }
      }
   }
}
'
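
For readers who want to see the idea without a running cluster, the following Python sketch models the two sub-fields as two separate toy indexes: one holding the whole value as a single term (playing the role of `mail_from.id`) and one holding the split terms (playing the role of `mail_from`). The class `MultiFieldSketch` and its methods are hypothetical, and the tokenizer is a simplified assumption rather than the real analyzer.

```python
import re
from collections import defaultdict


class MultiFieldSketch:
    """Toy model of a multi field: the same value is indexed twice."""

    def __init__(self):
        self.exact = defaultdict(set)     # role of mail_from.id (not analyzed)
        self.analyzed = defaultdict(set)  # role of mail_from (analyzed)

    def index(self, doc_id, mail_from):
        # Not-analyzed side: the whole value is one term.
        self.exact[mail_from].add(doc_id)
        # Analyzed side: simplified split on non-alphanumerics (assumption).
        for term in re.split(r"[^a-z0-9]+", mail_from.lower()):
            if term:
                self.analyzed[term].add(doc_id)

    def term(self, value):
        """Exact lookup against the not-analyzed side."""
        return sorted(self.exact.get(value, set()))

    def word(self, value):
        """Single-word lookup against the analyzed side."""
        return sorted(self.analyzed.get(value.lower(), set()))


idx = MultiFieldSketch()
idx.index(1, "senthil@gmail.com")
idx.index(2, "foobar.senthil@gmail.com")
idx.index(3, "someone@example.org")

print(idx.word("gmail"))              # documents whose address contains "gmail"
print(idx.term("senthil@gmail.com"))  # only the exact address
```

Keeping both sides costs extra index space, but it lets the exact-match lookup and the partial-word search coexist on the same property.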

clint


(senthil prabhu) #7

Thank you very much, Clinton.

