After a lot of digging around, I have come to ask here. Let me start with a simple use case.
curl -XPUT 'localhost:9200/my-index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "doctype": {
      "properties": {
        "message": {
          "type": "text"
        }
      }
    },
    "queries": {
      "properties": {
        "query": {
          "type": "percolator"
        }
      }
    }
  }
}
'
curl -XPUT 'localhost:9200/my-index/queries/2?refresh&pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match_phrase" : {
      "message" : "pub/sub"
    }
  }
}
'
curl -XPUT 'localhost:9200/my-index/queries/1?refresh&pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "match_phrase" : {
      "message" : "x++"
    }
  }
}
'
Now my problem is that if I execute
curl -XGET 'localhost:9200/my-index/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query" : {
    "percolate" : {
      "field" : "query",
      "document_type" : "doctype",
      "document" : {
        "message" : "A new bonsai pub sub tree in the office x"
      }
    }
  }
}
'
I get two matches: one on "pub sub" and the other on "x", coming from the stored queries "pub/sub" and "x++". I know this is because of the analyzer, which strips the special characters. But even if I change the mapping to
curl -XPUT 'localhost:9200/my-index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "doctype": {
      "properties": {
        "message": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    },
    "queries": {
      "properties": {
        "query": {
          "type": "percolator"
        }
      }
    }
  }
}
'
then the document "message" : "A new bonsai pub sub tree in the office x" gives zero matches, because the entire text is treated as a single not_analyzed term.
In short, is there any way to solve this? I want only those queries to match, phrase or non-phrase, whose text is indexed without removing special characters such as / and +.
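To make the behaviour described above concrete, here is a minimal Python sketch that imitates it outside Elasticsearch. It is only an approximation: `standard_like_tokens` is a hypothetical stand-in for the default analyzer (split on non-word characters, then lowercase), `whitespace_tokens` is a hypothetical stand-in for a tokenizer that splits on whitespace only, and `phrase_matches` is a simplified model of `match_phrase`. None of these names come from Elasticsearch itself.

```python
import re


def standard_like_tokens(text):
    # Rough approximation of the default analyzer: keep runs of word
    # characters and lowercase them, so "/" and "+" are dropped.
    return [token.lower() for token in re.findall(r"\w+", text)]


def whitespace_tokens(text):
    # Split on whitespace only, so "pub/sub" and "x++" stay intact.
    return text.split()


def phrase_matches(query_tokens, doc_tokens):
    # Simplified phrase match: the query tokens must appear in the
    # document as one contiguous run, in the same order.
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def matching_queries(tokenize, queries, document):
    # Return the stored queries that match the document when both sides
    # are tokenized with the same function.
    doc_tokens = tokenize(document)
    return [q for q in queries if phrase_matches(tokenize(q), doc_tokens)]


queries = ["pub/sub", "x++"]
document = "A new bonsai pub sub tree in the office x"

print(matching_queries(standard_like_tokens, queries, document))
print(matching_queries(whitespace_tokens, queries, document))
print(matching_queries(whitespace_tokens, queries, "we use pub/sub and x++ here"))
```

Under these assumptions, the first call reproduces the two unwanted matches, while the whitespace-only variant matches nothing for the original document and matches both queries only when the special characters are actually present. Whether a whitespace-based analyzer on the `message` field gives the same result inside the percolator is something I have not confirmed.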