Unable to preserve special characters in search results of ElasticSearch

Hi,

I want to preserve special characters such as -, /, ( and ) in search
results.

Example data: Abc/def a(bc)def a-bcd
Expected behaviour: if I enter Abc/, the records containing abc/ should be
returned. Similarly, abc/def should return the records containing abc/def,
and abc/def a(bc) should return the records containing that combination.

For this I have tried different analyzers and filters. The settings and
mappings are as follows:

PUT /facebook?pretty=true
{
    "settings" : {
        "analysis" : {
            "filter" : {
                "special_character_spliter" : {
                    "type" : "word_delimiter",
                    "preserve_original": "true"
                }
            },
            "analyzer" : {
                "my_an" : {
                    "type" : "pattern",
                    "pattern" : "[^(/-*\w\p{L}]+",
                    "tokenizer" : "whitespace",
                    "filter" : [ "special_character_spliter"]
                }
            }
        }
    },
    "mappings" : {
        "face" : {
            "properties" : {
                "msg" : {
                    "type" : "string",
                    "analyzer": "my_an"
                },
                "name" : {
                    "type" : "string"
                }
            }
        }
    }
}

I have also tried the following settings:

PUT facebook?pretty=true
{
    "settings": {
        "index.number_of_replicas": 0,
        "analysis": {
            "analyzer": {
                "msg_excp_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filters": [
                        "my_word_delimiter",
                        "my_word_pattern",
                        "lowercase",
                        "asciifolding",
                        "shingle",
                        "standard"
                    ]
                }
            },
            "filters": {
                "my_word_pattern": {
                    "type" : "pattern_capture",
                    "preserve_original" : 1,
                    "patterns" : [
                        "(\w+)",
                        "(\p{L}+)",
                        "(\d+)",
                        "@(.+)"
                    ]
                },
                "my_word_delimiter": {
                    "type": "word_delimiter",
                    "preserve_original": "true",
                    "catenate_all": "true",
                    "type_table": {
                        "$": "DIGIT",
                        "%": "DIGIT",
                        ".": "DIGIT",
                        ",": "DIGIT",
                        ":": "DIGIT",
                        "/": "DIGIT",
                        "\\": "DIGIT",
                        "=": "DIGIT",
                        "&": "DIGIT",
                        "(": "DIGIT",
                        ")": "DIGIT",
                        "<": "DIGIT",
                        ">": "DIGIT",
                        "\\U+000A": "DIGIT"
                    }
                },
                "my_asciifolding": {
                    "type": "asciifolding",
                    "preserve_original": true
                }
            }
        }
    },
    "mappings": {
        "face": {
            "properties": {
                "msg": {
                    "type": "string",
                    "index": "analyzed",
                    "analyzer": "msg_excp_analyzer"
                },
                "name": {
                    "type": "string",
                    "index": "analyzed",
                    "analyzer": "msg_excp_analyzer"
                }
            }
        }
    }
}

I tried each of the above settings separately in order to preserve the
special characters in search results, and I used a query like:

GET facebook/face/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "fields": [
                            "msg"
                        ],
                        "query": "a/"
                    }
                }
            ]
        }
    }
}

I am getting the following error:

"error": "SearchPhaseExecutionException[Failed to execute phase [query],
all shards failed; shardFailures {[trVZmtb4Qu-QWIqxK5hEfg][facebook][0]:
SearchParseException[[facebook][0]:

May I know what the problem is, and whether it is with my query or with my
settings?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a33a7559-3a09-4c24-a8a7-101d6e9fbda0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,
I think you need to escape those special characters in the search string,
like this:
{
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "fields": [
                            "msg"
                        ],
                        "query": "a\\/"
                    }
                }
            ]
        }
    }
}
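
In case it helps, here is a minimal Python sketch of the escaping suggested above. The helper names (escape_query_string, build_query) are made up for illustration and are not part of any Elasticsearch client. It backslash-escapes the characters that query_string treats as operators and drops < and >, which cannot be escaped. Note that once the body is serialized to JSON the backslash itself is doubled, so the request carries "a\\/".

```python
import json

# Characters that query_string treats as operators. "&&" and "||" are
# covered by escaping each "&" and "|" individually.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string(text):
    """Backslash-escape query_string operators so they are matched literally."""
    # "<" and ">" cannot be escaped, so this sketch simply drops them.
    kept = (ch for ch in text if ch not in "<>")
    return "".join("\\" + ch if ch in RESERVED else ch for ch in kept)


def build_query(field, user_input):
    """Build a query_string search body for one field (hypothetical helper)."""
    return {
        "query": {
            "query_string": {
                "fields": [field],
                "query": escape_query_string(user_input),
            }
        }
    }


body = build_query("msg", "a/")
# json.dumps doubles the backslash, which is what the request body needs.
print(json.dumps(body))
```

Escaping only stops the parse error. Whether the escaped character can actually match anything still depends on the analyzer keeping it in the indexed tokens.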

If I escape the character '/', I get data that is irrelevant to the search
term.

E.g., my data is: a/b, a/c, a/d, a(b), a(c), a-b, ab

My requirement is that if I enter the term a(, I should get only the
records containing a( as results, not ab, a-b, ...

I would like to get the results without escaping the characters, and the
results need to preserve the special character '/'. Any query that does
this would be helpful. The search term also needs to be searched in several
fields, but it does not need to match in all of them; a match in any one
field is enough.
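
To make the requirement concrete, here is a small plain-Python simulation. It is not how Elasticsearch analyzers are implemented, and the function names are invented for illustration. It compares prefix matching on the sample data when tokens keep their punctuation (split on whitespace only) with prefix matching when punctuation is discarded during tokenization.

```python
import re

DOCS = ["a/b", "a/c", "a/d", "a(b)", "a(c)", "a-b", "ab"]


def keep_punctuation(text):
    # Split on whitespace only, so "/", "(", ")" and "-" stay inside tokens.
    return text.lower().split()


def strip_punctuation(text):
    # Keep only runs of letters and digits; every other character is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def prefix_search(term, docs, tokenize):
    """Return the docs that have a token starting with any token of the term."""
    wanted = tokenize(term)
    return [
        doc
        for doc in docs
        if any(tok.startswith(w) for tok in tokenize(doc) for w in wanted)
    ]


print(prefix_search("a(", DOCS, keep_punctuation))   # only a(b) and a(c)
print(prefix_search("a(", DOCS, strip_punctuation))  # every document
```

With punctuation kept, a( matches only a(b) and a(c). With punctuation stripped, the term collapses to a and every record matches, which is the irrelevant-results behaviour described above.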

On Monday, March 23, 2015 at 7:50:08 PM UTC+5:30, Periyandavar wrote:

Hi,
I think you need to escape those special characters in the search string,
like this:
{
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "fields": [
                            "msg"
                        ],
                        "query": "a\\/"
                    }
                }
            ]
        }
    }
}

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Unable-to-preserve-special-characters-in-search-results-of-ElasticSearch-tp4072409p4072418.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.


Hi, thanks.

I resolved this issue.

To preserve the special characters and to search for the query term in multiple fields with an exact match, it is better to change the settings as shown below.

Settings I updated:

PUT /my_index/_settings?pretty=true
{
    "settings" : {
        "analysis": {
            "analyzer": {
                "wordAnalyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "word_delimiter_for_phone",
                        "nGram_filter"
                    ]
                }
            },
            "filter": {
                "word_delimiter_for_phone": {
                    "type": "word_delimiter",
                    "catenate_all": true,
                    "generate_number_parts": false,
                    "split_on_case_change": false,
                    "generate_word_parts": false,
                    "split_on_numerics": false,
                    "preserve_original": true
                },
                "nGram_filter": {
                    "type": "nGram",
                    "min_gram": 1,
                    "max_gram": 20,
                    "token_chars": [
                        "letter"
                    ]
                }
            }
        }
    }
}
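
As a rough illustration of why these settings keep the special characters searchable, here is a plain-Python sketch of whitespace tokenization followed by n-gram expansion with min_gram 1 and max_gram 20. It is a simplification under my own assumptions: the word_delimiter step is left out, and the ngrams and analyze helpers are hypothetical names, not Elasticsearch APIs.

```python
def ngrams(token, min_gram=1, max_gram=20):
    """Return every substring of token with length from min_gram to max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break
            grams.append(token[start:start + size])
    return grams


def analyze(text):
    # Whitespace tokenization, then n-gram expansion of each token.
    return [gram for token in text.split() for gram in ngrams(token)]


grams = analyze("A/T")
print(grams)
```

Because the tokenizer splits on whitespace only, grams such as A/ and /T are produced with the slash intact, so a query containing the slash has something to match against.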

Mapping settings:
{
    "mappings": {
        "face" : {
            "properties" : {
                "{field-1}id" : {
                    "type" : "string",
                    "index_name" : "change",
                    "analyzer": "wordAnalyzer"
                },
                "{field-2}name" : {
                    "type" : "string",
                    "index_name" : "change",
                    "analyzer": "wordAnalyzer"
                },
                "{field-3}Year" : {
                    "type" : "string",
                    "index_name" : "change",
                    "analyzer": "wordAnalyzer"
                },
                "{field-4}Make" : {
                    "type" : "string",
                    "index_name" : "change",
                    "analyzer": "wordAnalyzer"
                }
            }
        }
    }
}

The query we can use:

GET my_index/face/_search
{
    "query": {
        "match": {
            "change": {
                "query": "A/T o",
                "type": "phrase_prefix"
            }
        }
    }
}

With this, we can search for the term in all the fields, because every field shares the index_name "change". To search in only a single field, give that field's name in place of "change" in the match query.
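
For anyone who cannot use index_name, a different technique that may give a similar "match in any one field" behaviour is a multi_match query of type phrase_prefix. This is not what I used above. The sketch below only builds the request body in Python; the field names id, name, Year and Make are placeholders for the real field names, and it assumes an Elasticsearch version whose multi_match supports the phrase_prefix type.

```python
import json


def phrase_prefix_any_field(text, fields):
    """Build a search body matching text as a phrase prefix in any listed field."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "phrase_prefix",
                "fields": list(fields),
            }
        }
    }


# Placeholder field names; replace them with the fields of your own mapping.
body = phrase_prefix_any_field("A/T o", ["id", "name", "Year", "Make"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET my_index/face/_search.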

As for changing the mappings, I was able to update the analyzers but not the index_name. To add index_name, I deleted the index and created the mapping again as above.

On Tuesday, March 24, 2015 at 3:50:52 PM UTC+5:30, Muddadi Hemaanusha wrote:

If I escape the character '/', I get data that is irrelevant to the search
term.

E.g., my data is: a/b, a/c, a/d, a(b), a(c), a-b, ab

My requirement is that if I enter the term a(, I should get only the
records containing a( as results, not ab, a-b, ...

I would like to get the results without escaping the characters, and the
results need to preserve the special character '/'. Any query that does
this would be helpful. The search term also needs to be searched in several
fields, but it does not need to match in all of them; a match in any one
field is enough.
