Here is a quick test case that shows the problem:
curl -XDELETE localhost:9200/twitter | python -mjson.tool
curl -XPOST localhost:9200/twitter -d '
{"index":
  { "number_of_shards": 1,
    "analysis": {
      "filter": {
        "snowball": {
          "type" : "snowball",
          "language" : "English"
        }
      },
      "analyzer": { "a2" : {
          "type":"custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "snowball"]
        }
      }
    }
  }
}' | python -mjson.tool
sleep 1
curl -XPUT localhost:9200/twitter/tweet/_mapping -d '{
"tweet" : {
"properties" : {
"_all" : {"type" : "string", "analyzer":"a2"},
"message" : {"type" : "string",
"analyzer":"a2","include_in_all":"true"},
"user": {"type":"string"}
}
}}' | python -mjson.tool
sleep 1
curl -XPUT http://localhost:9200/twitter/tweet/1 -d '{ "user": "kimchy", "message": "Trying out searching teaching, so far so good?" }' | python -mjson.tool
sleep 1
curl -XGET 'localhost:9200/twitter/tweet/_search?q=message:teaching' | python -mjson.tool
sleep 1
curl -XGET 'localhost:9200/twitter/tweet/_search?q=_all:teaching' | python -mjson.tool
echo "Should have a hit"
On Aug 9, 4:53 pm, Christopher Burkey cbur...@entermediasoftware.com wrote:
I am having a related issue.
We get no results back unless we reduce the search term to its stem ourselves before sending the query.
curl -XGET 'http://localhost:9200/system/group/_search' -d '{ "query" : { "text" : { "_all" : "Testing" } } }'
Returns 0 hits
curl -XGET 'http://localhost:9200/system/group/_search' -d '{ "query" : { "text" : { "_all" : "testing" } } }'
Returns 0 hits
curl -XGET 'http://localhost:9200/system/group/_search' -d '{ "query" : { "text" : { "_all" : "test" } } }'
Returns 1 hit!
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "testid",
                "_index": "system",
                "_score": 0.13561106,
                "_source": {
                    "id": "testid",
                    "name": "Testing"
                },
                "_type": "group"
            }
        ],
        "max_score": 0.13561106,
        "total": 1
    },
    "timed_out": false,
    "took": 33
}
Here is my setup:
"cluster_name": "entermedia-test",
"master_node": "LfnoRHg1SYicB1p1rFSdrg",
"metadata": {
"indices": {
"system": {
"aliases": ,
"mappings": {
"group": {
"properties": {
"_all": {
"analyzer": "lowersnowball",
"type": "string"
},
"id": {
"include_in_all": true,
"index": "not_analyzed",
"store": "yes",
"type": "string"
},
"name": {
"include_in_all": true,
"index": "not_analyzed",
"store": "yes",
"type": "string"
}
}
}
},
"settings": {
"index.analysis.analyzer.lowersnowball.filter.0": "snowball",
"index.analysis.analyzer.lowersnowball.filter.1": "standard",
"index.analysis.analyzer.lowersnowball.filter.2": "lowercase",
"index.analysis.analyzer.lowersnowball.tokenizer": "standard",
"index.analysis.analyzer.lowersnowball.type": "custom",
"index.number_of_replicas": "1",
"index.number_of_shards": "5"
},
"state": "open"
}
},
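One way to narrow this down might be to ask the index what tokens the lowersnowball analyzer actually produces for each of the three search terms, and compare them with the term that matched. Below is a rough sketch, assuming the node exposes the _analyze endpoint with an analyzer query parameter and accepts the text to analyze as the request body. The helper names analyze_url and show_tokens are made up for illustration; the host, index, and analyzer names come from the setup above.

```shell
#!/bin/sh
# Hypothetical helper: build the _analyze URL for a host, index, and analyzer.
# Assumes the _analyze endpoint takes the analyzer name as a query parameter.
analyze_url() {
  host=$1
  index=$2
  analyzer=$3
  printf '%s/%s/_analyze?analyzer=%s' "$host" "$index" "$analyzer"
}

# Hypothetical helper: send one term through the analyzer and pretty-print
# the tokens that come back. Needs a running node to do anything useful.
show_tokens() {
  term=$1
  curl -s -XGET "$(analyze_url localhost:9200 system lowersnowball)" -d "$term" |
    python -mjson.tool
}
```

Running show_tokens Testing, show_tokens testing, and show_tokens test against the live index should show whether all three reduce to the same token. If they do, the mismatch is more likely on the query side than in the analyzer definition.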
On Aug 9, 11:31 am, Jan Fiedler fiedler....@gmail.com wrote:
You are using a prefix filter (in a constant score query). The prefix filter
is similar to a term filter and does not analyze the term (i.e. does not
apply the lowercase filter to your uppercase term). In your scenario, I
think it's best to lowercase the term in the query (on the client side) to
match what the analyzer does at indexing time.
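A minimal sketch of that client-side lowercasing, in the same shell style as the test case earlier in the thread. The helper names lowercase_term and prefix_query are invented here, and the field name passed in is only an example. The sketch does no JSON escaping, so it assumes the term contains no quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: lowercase a term on the client, mirroring what a
# lowercase token filter does at index time.
lowercase_term() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: build a constant_score query wrapping a prefix filter.
# The prefix filter is not analyzed, so the term is lowercased before it is
# placed in the request body. No JSON escaping is done (assumption: the
# field and term contain no quotes or backslashes).
prefix_query() {
  field=$1
  term=$(lowercase_term "$2")
  printf '{"query":{"constant_score":{"filter":{"prefix":{"%s":"%s"}}}}}' "$field" "$term"
}
```

The result can be passed straight to curl, for example: curl -XGET 'localhost:9200/twitter/tweet/_search' -d "$(prefix_query user Kim)". Note that this only covers lowercasing; if the analyzer also stems, the prefix still has to be short enough to match the stemmed form.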