Hi all,
I'm trying to break up some strings for use in full-text search while leaving
the original field intact. I have created a "full_text" field that is
populated from a "name" field using "copy_to", with an analyzer that looks
like this:

    "settings" : {
      "analysis": {
        "char_filter" : {
          "full_text_mapping" : {
            "type": "mapping",
            "mappings" : [".=>%20", "_=>%20"]
          }
        },
        "analyzer" : {
          "full_text_analyzer" : {
            "type" : "custom",
            "char_filter" : "full_text_mapping",
            "tokenizer" : "whitespace",
            "filter" : ["lowercase"]
          }
        }
      }
    },

As you can see, I'm trying to convert '.' and '_' to ' ' before the
whitespace tokenizer kicks in. My understanding is that the char_filter
replaces those characters with whitespace, the whitespace tokenizer then
splits on that whitespace, and each component becomes searchable on its
own. For instance, I would expect "GRIZZLY.BEAR" to be found using both
"grizzly" and "bear". With this analyzer, however, I am not able to find
the document with either term. What am I misunderstanding? Here is the
full script showing what I'm doing:
#!/bin/sh
ES=localhost:9200
echo ">>> Deleting _all"
curl -XDELETE $ES/_all
echo ">>> Creating the index 'animals'"
curl -XPUT $ES/animals -d'
{
  "settings" : {
    "analysis": {
      "char_filter" : {
        "full_text_mapping" : {
          "type": "mapping",
          "mappings" : [".=>%20", "_=>%20"]
        }
      },
      "analyzer" : {
        "full_text_analyzer" : {
          "type" : "custom",
          "char_filter" : "full_text_mapping",
          "tokenizer" : "whitespace",
          "filter" : ["lowercase"]
        }
      }
    }
  },
  "mappings" : {
    "bear" : {
      "properties" : {
        "suggest" : {
          "type" : "completion",
          "analyzer" : "simple",
          "payloads" : true
        },
        "full_text" : {
          "type" : "string",
          "analyzer" : "full_text_analyzer"
        },
        "name" : {
          "type" : "string",
          "index" : "not_analyzed",
          "copy_to" : "full_text"
        }
      }
    }
  }
}' && echo
echo ">>> Indexing the GRIZZLY.BEAR document"
curl -XPOST $ES/animals/bear -d'
{
  "name": "GRIZZLY.BEAR"
}
' && echo
curl -XPOST $ES/animals/_flush && echo
# Search for the document using the name
echo
echo ">>> Searching for name:GRIZZLY.BEAR"
echo
curl $ES/animals/bear/_search -d'
{
  "query" : {
    "match" : {
      "name" : "GRIZZLY.BEAR"
    }
  }
}
' && echo
# Search for the document using a general term
echo
echo ">>> Searching for full_text:grizzly"
echo
curl $ES/animals/bear/_search -d'
{
  "query" : {
    "match" : {
      "full_text" : "grizzly"
    }
  }
}
' && echo
# Search for the document using another general term
echo
echo ">>> Searching for full_text:bear"
echo
curl $ES/animals/bear/_search -d'
{
  "query" : {
    "match" : {
      "full_text" : "bear"
    }
  }
}
' && echo
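
To spell out the tokens I expect "full_text_analyzer" to produce, here is a rough sketch in plain shell. It only imitates the three analysis steps (character replacement, whitespace splitting, lowercasing) with tr and sed and does not talk to Elasticsearch at all. The function name expected_tokens is made up for this illustration.

```shell
#!/bin/sh
# Imitation of the INTENDED analysis chain, not Elasticsearch itself.
# expected_tokens is a hypothetical helper used only for this sketch.
# Step 1: turn '.' and '_' into spaces (what I want the char_filter to do).
# Step 2: split on runs of spaces/tabs, one token per line (whitespace tokenizer).
# Step 3: lowercase every token (lowercase token filter).
# Step 4: drop empty lines caused by leading whitespace.
expected_tokens() {
    printf '%s\n' "$1" |
        tr '._' '  ' |
        tr -s ' \t' '\n\n' |
        tr '[:upper:]' '[:lower:]' |
        sed '/^$/d'
}

# Prints "grizzly" and "bear" on separate lines.
expected_tokens "GRIZZLY.BEAR"
```

If the real analyzer behaved like this sketch, both match queries on "full_text" above would find the document.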
I appreciate any help with this!
Cheers,
Craig