I have a requirement to query docs by phone number. Users can enter characters such as parentheses and dashes in the search query string, and these should be ignored. So I created a custom analyzer with a char_filter of type pattern_replace that uses a regex to remove everything but digits. However, Elasticsearch does not seem to be filtering out the non-digits. Here is a sample of what I am trying to do:
Index Creation
PUT my_test_index
{
  "settings": {
    "index": {
      "analysis": {
        "char_filter": {
          "non_digit": {
            "pattern": "\\D",
            "type": "pattern_replace",
            "replacement": ""
          }
        },
        "analyzer": {
          "no_digits_analyzer": {
            "type": "custom",
            "cahr_filter": [
              "non_digit"
            ],
            "tokenizer": "keyword"
          }
        }
      }
    }
  },
  "mappings": {
    "doc_with_phone_prop": {
      "properties": {
        "phone": {
          "type": "text",
          "analyzer": "no_digits_analyzer",
          "search_analyzer": "no_digits_analyzer"
        }
      }
    }
  }
}
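To be explicit about what I expect the analyzer to do, here is a minimal Python sketch of the intended normalization. This is not Elasticsearch code: `normalize_phone` is a hypothetical helper name, and Python's `re` module only approximates the Java regex engine that pattern_replace uses.

```python
import re

# Same pattern as the "non_digit" char filter: \D matches any non-digit character.
NON_DIGIT = re.compile(r"\D")


def normalize_phone(raw: str) -> str:
    """Strip every non-digit character, mimicking pattern_replace with an empty replacement."""
    return NON_DIGIT.sub("", raw)


if __name__ == "__main__":
    # Each of these inputs should collapse to the same digits-only string.
    for raw in ["3035555555", "(303)5555555", "(303) 555-5555"]:
        print(raw, "->", normalize_phone(raw))
```

With the keyword tokenizer, I would expect the whole normalized string to be indexed and searched as a single token.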
Inserting one doc
PUT my_test_index/doc_with_phone_prop/1
{
  "phone": "3035555555"
}
Querying without any parentheses or dashes in the phone number
POST my_test_index/doc_with_phone_prop/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "3035555555",
            "fields": ["phone"]
          }
        }
      ]
    }
  }
}
This returns one document correctly:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_test_index",
        "_type": "doc_with_phone_prop",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "phone": "3035555555"
        }
      }
    ]
  }
}
Querying with parentheses does not return anything, even though I assumed that my no_digits_analyzer would remove everything but digits from the search terms.
POST my_test_index/doc_with_phone_prop/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "\\(303\\)5555555",
            "fields": ["phone"]
          }
        }
      ]
    }
  }
}
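One check I could run is the `_analyze` API, which reports the tokens an analyzer emits for a given text. Below is a small Python sketch that only builds such a request and does not send it. `build_analyze_request` is a hypothetical helper name; the index and analyzer names are the ones from this question.

```python
import json


def build_analyze_request(index: str, analyzer: str, text: str):
    """Return (method, path, body) for an _analyze call; nothing is sent over the network."""
    body = json.dumps({"analyzer": analyzer, "text": text}, indent=2)
    return "GET", f"{index}/_analyze", body


if __name__ == "__main__":
    method, path, body = build_analyze_request(
        "my_test_index", "no_digits_analyzer", "(303)5555555"
    )
    # Print the request in the same console-style layout used in this post.
    print(method, path)
    print(body)
```

If the char filter were applied, I would expect the response to contain a single token `3035555555`. A token that still contains the parentheses would suggest the filter is not being run at all.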
What am I doing wrong here?
I am using Elasticsearch 5.3.
Thanks.