Facet not working for email fields in ES version 0.16.0


(senthil prabhu) #1

Hi All,

I inserted the value "xxxx@gmail.com,yyyy@yahoo.com,zzzz@rediff.com"
into a field named "fromfield".

I am using the query below for faceting in both ES 0.15.2 and
ES 0.16.0:

{ "query" : { "match_all" : { } }, "from" : 0, "size" : 0, "facets" :
{"tagfromfield" : { "terms" : { "field" :"fromfield" ,"regex" :
".*" ,"regex_flags" : "DOTALL","size" : 1000 } } , "global" : false },
"explain" : false }

In ES 0.15.2 I get these facet terms:

xxxx@gmail.com
yyyy@yahoo.com
zzzz@rediff.com

but in ES 0.16.0 I get:

xxxx
gmail.com
yyyy
yahoo.com
zzzz
rediff.com

How can I get each email address back as a single facet term in
ES 0.16.0?


(Clinton Gormley) #2

Hi Senthil

I had inserted a value "xxxx@gmail.com,yyyy@yahoo.com,zzzz@rediff.com"
for a field named
"fromfield"

but in ES 16.0 version I am getting the result as

xxxx
gmail.com
yyyy
yahoo.com
zzzz
rediff.com

You will need to set the mapping for the fromfield to use
{"type": "string", "index": "not_analyzed"}

clint
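
For readers who prefer to build the request body programmatically, here is a minimal Python sketch of the mapping Clint describes. The index name "dbmail1", the type name "metadata", and the local URL are assumptions borrowed from the commands posted later in this thread. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumed for illustration: index "dbmail1" and type "metadata" on a local
# node, as used later in this thread.
MAPPING_URL = "http://localhost:9200/dbmail1/metadata/_mapping"

# Mapping that stores each fromfield value as one untokenized term.
mapping = {
    "metadata": {
        "properties": {
            "fromfield": {"type": "string", "index": "not_analyzed"}
        }
    }
}

# json.dumps takes care of quoting, so the body is always valid JSON.
mapping_body = json.dumps(mapping)

# Build the PUT request; nothing is sent until urlopen() is called.
request = urllib.request.Request(
    MAPPING_URL, data=mapping_body.encode("utf-8"), method="PUT"
)

print(request.get_method(), request.full_url)
print(mapping_body)
# To actually send it: urllib.request.urlopen(request)
```

The mapping should be in place before any documents are indexed, since values that were already analyzed keep the terms they were indexed with.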


(senthil prabhu) #3

Hi Clinton

I tried the commands below, but I still get the same problem with the
not_analyzed mapping:

$ curl -XPUT 'http://localhost:9200/dbmail1/'
{"ok":true,"acknowledged":true}

$ curl -XPUT 'http://localhost:9200/dbmail1/metadata/_mapping' -d \
'{"metadata" : {"dynamic" :"true","_source" :
{"enabled" :true },"properties" : {"fromfield" :
{"type" :"string","index" :"not_analyzed"}} } }'
{"ok":true,"acknowledged":true}

$ curl -XPUT 'http://localhost:9200/dbmail1/metadata/1' -d \
'{"fromfield":"[zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com]"}'
{"ok":true,"_index":"dbmail1","_type":"metadata","_id":"4","_version":1}

$ curl -XPUT 'http://localhost:9200/dbmail1/metadata/2' -d \
'{"fromfield":"zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com"}'
{"ok":true,"_index":"dbmail1","_type":"metadata","_id":"5","_version":1}

$ curl -XGET 'http://localhost:9200/dbmail1/metadata/_search' -d \
'{ "query" : { "match_all" : { } }, "from" : 0, "size" : 0, "facets" :
{"tagfromfield" : {
"terms" : { "field" :"fromfield" ,"regex" :".*" ,"regex_flags" :
"DOTALL","size" : 1000 } } , "global" : false },"explain" : false }'

{"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},
"hits":{"total":5,"max_score":1.0,"hits":[]},
"facets":{"tagfromfield":{"_type":"terms","missing":0,"terms":[
{"term":"zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com","count":1},
{"term":"[zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com]","count":1}]}}}

I need a result like this:

{"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},
"hits":{"total":5,"max_score":1.0,"hits":[]},
"facets":{"tagfromfield":{"_type":"terms","missing":0,"terms":[
{"term":"zzzz@rediff.com","count":2},
{"term":"aaaa@hhh.com","count":2},
{"term":"llll@kkkk.com","count":2}]}}}



(Clinton Gormley) #4

Hi Senthil

$ curl -XPUT 'http://localhost:9200/dbmail1/metadata/1' -d \
'{"fromfield":"[zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com]"}'

Your JSON is wrong: the value is one string that happens to contain
brackets and commas, not an array of strings. It should be:

$ curl -XPUT 'http://localhost:9200/dbmail1/metadata/1' -d \
'{"fromfield":["zzzz@rediff.com","aaaa@hhh.com","llll@kkkk.com"]}'

clint
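
The difference between the two request bodies can be checked outside Elasticsearch. The Python sketch below is only an approximation of what a not_analyzed field does: it assumes each value is indexed whole and that a terms facet counts each distinct term once per document. The helper names `not_analyzed_terms` and `facet_counts` are made up for this illustration.

```python
import json
from collections import Counter

# The two request bodies from the thread: one string that merely looks like
# a list, and a real JSON array of three strings.
as_string = '{"fromfield":"[zzzz@rediff.com,aaaa@hhh.com,llll@kkkk.com]"}'
as_array = '{"fromfield":["zzzz@rediff.com","aaaa@hhh.com","llll@kkkk.com"]}'


def not_analyzed_terms(body):
    """Return the values a not_analyzed field would keep: each value whole."""
    value = json.loads(body)["fromfield"]
    return value if isinstance(value, list) else [value]


def facet_counts(bodies):
    """Count, per term, how many documents contain it (simplified facet)."""
    counts = Counter()
    for body in bodies:
        counts.update(set(not_analyzed_terms(body)))
    return counts


string_counts = facet_counts([as_string])
array_counts = facet_counts([as_array, as_array])

print(string_counts)  # a single term holding the whole string
print(array_counts)   # three terms, one per address
```

With the string form there is nothing for the facet to separate, so the whole value comes back as one term. With the array form, each address is its own value and is counted on its own.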


(senthil prabhu) #5

But I tried the same query in version 0.15.2 and it works fine there.



(Shay Banon) #6

The way emails are broken down into terms has changed. Before, an email address was a single token/term; now it's broken down into several terms. You should map the field as not_analyzed.
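
The change described above can be imitated with a rough Python sketch. These functions are not the Lucene tokenizers; they are hypothetical stand-ins that assume only commas, whitespace and (in the newer behaviour) "@" act as separators, which is enough to reproduce the terms reported in this thread.

```python
import re

value = "xxxx@gmail.com,yyyy@yahoo.com,zzzz@rediff.com"


def tokens_keeping_emails(text):
    # Stand-in for the 0.15 behaviour reported in the thread:
    # each address survives as one term.
    return [t for t in re.split(r"[,\s]+", text) if t]


def tokens_splitting_emails(text):
    # Stand-in for the 0.16 behaviour reported in the thread:
    # "@" now also separates terms.
    return [t for t in re.split(r"[,@\s]+", text) if t]


def tokens_not_analyzed(text):
    # not_analyzed: no tokenization, the whole value is a single term.
    return [text]


old_terms = tokens_keeping_emails(value)
new_terms = tokens_splitting_emails(value)
raw_terms = tokens_not_analyzed(value)

print(old_terms)
print(new_terms)
print(raw_terms)
```

Note that not_analyzed keeps the entire comma-separated string as one term, which is why the addresses also need to be sent as separate array values.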
On Wednesday, April 27, 2011 at 4:28 PM, senthil prabhu wrote:

But i have tried the same query in version 15.2 it's working fine....



(srrin) #7

Hi Shay,
Thank you. Since emails are now broken into multiple terms, do we need to re-index our existing indexes if I upgrade to 0.16.0?

Is there any workaround to achieve this without re-indexing?

Thanks
SRR


(Shay Banon) #8

You can keep the same analysis logic you had in 0.15. I'm not sure about your analysis configuration, but a simple option is to set the default analyzer to behave like Lucene 3.0. Here is a sample of how to do it: https://gist.github.com/950318 (you can put it in the elasticsearch.json config file, or convert it to a YAML file).
On Saturday, April 30, 2011 at 6:13 PM, srrIN wrote:

Hi Shay, Thank you. Since the emails are broken in to multiple terms, do we need to re-index the existing indexes if I upgrade to 0.16.0? Is there any other work around to achieve this without re-indexing? Thanks SRR

