I've searched this list for posts on autocomplete and found several
strategies, and I'm trying to get one of them to work here.
So, I tried indexing my data with a field named auto_complete that uses a
different analyzer than the default one:
{
  "analysis" : {
    "analyzer" : {
      "auto_complete" : {
        "type" : "custom",
        "tokenizer" : "custom_edgeNGram",
        "filter" : ["lowercase", "custom_ngram"]
      }
    },
    "tokenizer" : {
      "custom_edgeNGram" : {
        "type" : "edgeNGram",
        "min_gram" : 2,
        "max_gram" : 5
      }
    },
    "filter" : {
      "custom_ngram" : {
        "type" : "nGram",
        "min_gram" : 2,
        "max_gram" : 5
      }
    }
  }
}
{
  "artist" : {
    "properties" : {
      "artist_id" : {"type" : "integer", "store" : "no", "index" : "no", "include_in_all" : "false"},
      "name" : {"type" : "string", "store" : "no", "include_in_all" : "true"},
      "rating" : {"type" : "float", "store" : "no", "include_in_all" : "false"},
      "elevation" : {"type" : "float", "store" : "no", "include_in_all" : "false", "index" : "no"},
      "alpha" : {"type" : "integer", "store" : "no", "include_in_all" : "false"},
      "auto_complete" : {"type" : "string", "store" : "no", "include_in_all" : "false", "analyzer" : "auto_complete"}
    }
  }
}
Using this new field for auto_complete queries works fine as long as I'm
searching for only one term (I tried a mix of text and query_string queries).
There was a message on this list about the same problem, but it got no answer.
So basically, searching for "Pink" brings back Pink, Pink Floyd, Pink
Panther ... Searching for "Pink Fl", one would expect a different result,
but when I search on the nGrams I get the same output, and sometimes even
"worse" results.
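
To make the symptom concrete, here is a rough Python model of the analysis chain configured above. It is only a sketch and does not call Elasticsearch: it assumes the edgeNGram tokenizer treats the whole input as one token and emits prefixes from the front edge, and that the nGram filter then emits every substring of length 2 to 5 of each token. The function names are made up for illustration. Because "analyzer" in the mapping applies at both index and search time, the same model is used for documents and queries.

```python
def edge_ngram_tokenizer(text, min_gram=2, max_gram=5):
    """Assumed behavior: prefixes of the whole input, front edge only."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


def ngram_filter(token, min_gram=2, max_gram=5):
    """Assumed behavior: all substrings of the token, length min_gram..max_gram."""
    return [token[i:i + n]
            for n in range(min_gram, max_gram + 1)
            for i in range(len(token) - n + 1)]


def auto_complete_terms(text):
    """Approximate set of terms the auto_complete analyzer would produce."""
    terms = set()
    for token in edge_ngram_tokenizer(text):
        terms.update(ngram_filter(token.lower()))
    return terms


# With max_gram = 5 only the first five characters ("Pink ") are ever seen,
# so in this model the three inputs below produce identical term sets.
print(auto_complete_terms("Pink Fl") == auto_complete_terms("Pink Floyd"))
print(auto_complete_terms("Pink Fl") == auto_complete_terms("Pink Panther"))
print("fl" in auto_complete_terms("Pink Fl"))
```

If this model is close to what the real analyzer does, the "Fl" part of the query never reaches the index or the search, which would explain why "Pink" and "Pink Fl" return the same hits.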
I'd like to hear what general strategy you guys use for
auto_complete. I had some success using a boolean query with AND operators on
each term and "text_phrase_prefix", but the response time was really high
(something I really would not like for an autocomplete function).
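
For reference, one possible reading of that boolean query is sketched below. Python is used only to build the request body. The helper name build_autocomplete_query and the choice of the "name" field are assumptions for illustration, not the exact query that was run: every complete term becomes a required "text" clause, and the last, partially typed term becomes a required "text_phrase_prefix" clause.

```python
import json


def build_autocomplete_query(user_input, field="name"):
    """Build a bool query: all full terms must match, last term is a prefix."""
    terms = user_input.split()
    if not terms:
        # Nothing typed yet: fall back to matching everything.
        return {"query": {"match_all": {}}}
    *complete, partial = terms
    must = [{"text": {field: term}} for term in complete]
    must.append({"text_phrase_prefix": {field: partial}})
    return {"query": {"bool": {"must": must}}}


print(json.dumps(build_autocomplete_query("Pink Fl"), indent=2))
```

Each keystroke then sends one clause per typed word plus a prefix expansion for the last one, which may be part of why the response time grows.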
Regards