Hi, I've set up a custom analyzer, but it does not seem to be applied when I search.
Elasticsearch version: 7.0.1
Create index:
PUT /analyze_email
{
  "settings": {
    "number_of_shards": "1",
    "number_of_replicas": "0",
    "analysis": {
      "filter": {
        "email": {
          "type": "pattern_capture",
          "preserve_original": "true",
          "patterns": [
            "([^@]+)",
            "(\\p{L}+)",
            "(\\d+)",
            "@(.+)",
            "(@)"
          ]
        }
      },
      "analyzer": {
        "email_analyzer": {
          "tokenizer": "uax_url_email",
          "filter": [
            "lowercase",
            "email",
            "unique"
          ]
        }
      },
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "email_address": {
        "type": "text",
        "analyzer": "email_analyzer",
        "norms": "false",
        "search_analyzer": "email_analyzer",
        "fields": {
          "raw": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    }
  }
}
Testing email_analyzer with the _analyze API gives the tokens I expect:
GET analyze_email/_analyze?pretty
{
  "field": "email_address",
  "text": "jinliantest@gmail.com"
}
Result:
{
  "tokens" : [
    {
      "token" : "jinliantest@gmail.com",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    },
    {
      "token" : "jinliantest",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    },
    {
      "token" : "@",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    },
    {
      "token" : "gmail.com",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    },
    {
      "token" : "gmail",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    },
    {
      "token" : "com",
      "start_offset" : 0,
      "end_offset" : 21,
      "type" : "<EMAIL>",
      "position" : 0
    }
  ]
}
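The same API can also be pointed at a short search string. A match query runs its text through the field's search_analyzer (email_analyzer here) before looking up terms, so it may be worth checking what a lone "@" is turned into. This is only a diagnostic sketch; whether the uax_url_email tokenizer keeps or drops a bare "@" is an assumption to verify, not something the output above shows.

```
GET analyze_email/_analyze
{
  "analyzer": "email_analyzer",
  "text": "@"
}
```

If this comes back with an empty tokens array, the pattern_capture filter never receives anything to split, and a match query for that text would have no terms to look up.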
Then I indexed a document:
PUT analyze_email/_doc/1
{
  "email_address": "jinliantest@gmail.com"
}
But when I search for "@", I get no hits. Shouldn't this return doc 1, since "@" is one of the tokens the analyzer produced?
GET analyze_email/_search
{
  "query": {
    "match": { "email_address": "@" }
  }
}
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
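For comparison, a term query skips search-time analysis and looks up the given string directly in the index. A minimal sketch, assuming the "@" token was indexed for doc 1 as the _analyze output for the full address suggests:

```
GET analyze_email/_search
{
  "query": {
    "term": { "email_address": "@" }
  }
}
```

If this returns doc 1 while the match query does not, the difference would most likely lie in how the query text is analyzed rather than in what was indexed.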