I'm using Elasticsearch 1.5.2 provided by Amazon AWS, and I'm using the official PHP library to connect to it. My problem is this: if I apply an ngram analyzer to the searchable fields and then search them with a "multi_match" query, and the search term contains whitespace, the query returns all the entries instead of just the relevant ones. If I remove the ngram analyzer, multi-word queries behave normally and return the relevant results, but then I lose partial matching.
The following is what I've tried:
- Changing the tokenizer (for example, setting it to whitespace)
- Defining ngram as a tokenizer instead of as a filter
- Using a pattern tokenizer and specifying whitespace as the pattern so that it splits by space
- Using edge-ngram instead of ngram
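For readers trying to reproduce this, the variations listed above might look roughly like the analysis settings below. This is only a sketch in Elasticsearch 1.x syntax: the names `ngram_tokenizer`, `space_tokenizer`, `edge_ngram_filter`, and `whitespace_ngram_analyzer` are hypothetical, the `token_chars` setting is an assumption, and the gram sizes are simply copied from the filter in the code further down. They are not the exact settings that were run.

```php
<?php
// Hypothetical sketch of the attempted variations (Elasticsearch 1.x syntax).
// Every name below is made up for illustration.
$analysis = array(
    'tokenizer' => array(
        // ngram defined as a tokenizer rather than as a token filter
        'ngram_tokenizer' => array(
            'type'        => 'nGram',
            'min_gram'    => 1,
            'max_gram'    => 15,
            // assumption: only letters and digits are kept inside a gram
            'token_chars' => array('letter', 'digit'),
        ),
        // pattern tokenizer that splits on runs of whitespace
        'space_tokenizer' => array(
            'type'    => 'pattern',
            'pattern' => '\\s+',
        ),
    ),
    'filter' => array(
        // the ngram filter from the original setup
        'ngram_token_filter' => array(
            'type'     => 'nGram',
            'min_gram' => 1,
            'max_gram' => 15,
        ),
        // edge ngram in place of ngram
        'edge_ngram_filter' => array(
            'type'     => 'edgeNGram',
            'min_gram' => 1,
            'max_gram' => 15,
        ),
    ),
    'analyzer' => array(
        // whitespace tokenizer in place of the standard one
        'whitespace_ngram_analyzer' => array(
            'type'      => 'custom',
            'tokenizer' => 'whitespace',
            'filter'    => array('lowercase', 'ngram_token_filter'),
        ),
    ),
);
```

Each variation would be tried one at a time by pointing the field's analyzer at the corresponding tokenizer or filter.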
None of the above made any difference. One other thing I tried is a word_delimiter filter with catenate_all set to true. The effect of this was to join multiple words into a single word by removing the spaces. This, when coupled with the ngram filter, seemed to work for some cases, but it's not a viable solution because there are too many edge cases that I can't account for (like when the two search terms aren't in the same position).
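A word_delimiter setup of that kind could be sketched as below. The question does not say which tokenizer was paired with it, so the `keyword` tokenizer here is an assumption (it keeps the whole input, spaces included, as one token for word_delimiter to split and rejoin), and the names `catenate_filter` and `catenate_ngram_analyzer` are hypothetical.

```php
<?php
// Hypothetical sketch: word_delimiter with catenate_all, followed by ngram.
$analysis = array(
    'filter' => array(
        'catenate_filter' => array(
            'type'         => 'word_delimiter',
            // also emit the parts of a token joined together, e.g. "red car" -> "redcar"
            'catenate_all' => true,
        ),
        'ngram_token_filter' => array(
            'type'     => 'nGram',
            'min_gram' => 1,
            'max_gram' => 15,
        ),
    ),
    'analyzer' => array(
        'catenate_ngram_analyzer' => array(
            'type'      => 'custom',
            // assumption: the tokenizer actually used is not stated in the question
            'tokenizer' => 'keyword',
            'filter'    => array('lowercase', 'catenate_filter', 'ngram_token_filter'),
        ),
    ),
);
```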
My requirement is to have partial matching and also to allow a search term with spaces in it.
Following is my code.
    // Analyzer
    'analysis' => array(
        "filter" => array(
            "ngram_token_filter" => array(
                "type" => "nGram",
                "min_gram" => "1",
                "max_gram" => "15"
            )
        ),
        'analyzer' => array(
            'ngram_analyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'standard',
                'filter' => array(
                    'lowercase',
                    'ngram_token_filter'
                )
            )
        )
    )

    // Mapping
    'properties' => array(
        'title' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
        'description' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
        'type' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
        'status' => array('type' => 'byte')
    )

    // Query
    'query' => array(
        'filtered' => array(
            'query' => array(
                'multi_match' => array(
                    'query' => $searchTerm,
                    'type' => 'most_fields',
                    "minimum_should_match" => "75%",
                    'fields' => array('title^2', 'description', 'type')
                )
            ),
            'filter' => array(
                'bool' => array(
                    'must' => array(
                        array(
                            'term' => array(
                                'status' => 1
                            )
                        )
                    )
                )
            )
        )
    )
Any help would be greatly appreciated. Thanks.