Whitespace in search term causes ES to return all entries when ngram analyzer is used

Hi,
I'm using Elasticsearch 1.5.2 provided by Amazon AWS, and I'm using the official PHP library to connect to it. When I apply an ngram analyzer to the searchable fields and then search them with a "multi_match" query, any search term that contains whitespace makes the query return all the entries instead of just the relevant ones. If I remove the ngram analyzer, multi-word queries behave normally and return the relevant results, but then I lose partial matching.

The following is what I've tried:

  1. Setting the tokenizer to whitespace or to keyword
  2. Defining ngram as a tokenizer instead of as a token filter
  3. Using a pattern tokenizer with whitespace as the pattern, so that it splits on spaces
  4. Using edge-ngram instead of ngram

None of the above made any difference. One other thing I tried is the word_delimiter filter with catenate_all set to true. The effect of this was to join multiple words into a single word by removing the spaces. When coupled with the ngram filter, this seemed to work for some cases, but it's not a viable solution because there are too many edge cases that I can't account for (like when the two search terms aren't in the same position).

My requirement is to have partial matching and also to allow a search term with spaces in it.

Following is my code.

// Analyzer
'analysis' => array(
    'filter' => array(
        'ngram_token_filter' => array(
            'type' => 'nGram',
            'min_gram' => '1',
            'max_gram' => '15'
        )
    ),
    'analyzer' => array(
        'ngram_analyzer' => array(
            'type' => 'custom',
            'tokenizer' => 'standard',
            'filter' => array(
                'lowercase',
                'ngram_token_filter'
            )
        )
    )
)

// Mapping
'properties' => array(
    'title' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
    'description' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
    'type' => array('type' => 'string', 'analyzer' => 'ngram_analyzer'),
    'status' => array('type' => 'byte')
)

// Query
'query' => array(
    'filtered' => array(
        'query' => array(
            'multi_match' => array(
                'query' => $searchTerm,
                'type' => 'most_fields',
                "minimum_should_match" => "75%",
                'fields' => array('title^2', 'description', 'type')
            )
        ),
        'filter' => array(
            'bool' => array(
                'must' => array(
                    array(
                        'term' => array(
                            'status' => 1
                        )
                    )
                )
            )
        )
    )
)

Any help would be greatly appreciated. Thanks.

Hi,
So the problem I had was that I hadn't specified a search analyzer. Without one, the ngram analyzer from the mapping is also applied to the search term at query time. Once I specified whitespace as the search analyzer in the mapping, the issue went away and I could do ngram partial matching on multi-word queries. So that fixed it for me.

// Mapping
'properties' => array(
    'title' => array('type' => 'string', 'analyzer' => 'ngram_analyzer', 'search_analyzer' => 'whitespace'),
    'description' => array('type' => 'string', 'analyzer' => 'ngram_analyzer', 'search_analyzer' => 'whitespace'),
    'type' => array('type' => 'string', 'analyzer' => 'ngram_analyzer', 'search_analyzer' => 'whitespace'),
    'status' => array('type' => 'byte')
)
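
For anyone wondering why the search analyzer matters, here is a rough plain-PHP sketch of what the two analyzers do to a search term. It is only an approximation and not Elasticsearch code: the function names (ngrams, analyzeWithNgram, analyzeWithWhitespace) are made up for illustration, splitting on whitespace only stands in for the standard tokenizer, and the real filter may emit the tokens in a different order.

```php
<?php
// Hypothetical helper: approximates the nGram token filter
// with min_gram=1 and max_gram=15 for a single token.
function ngrams($token, $min = 1, $max = 15)
{
    $grams = array();
    $len = strlen($token);
    for ($start = 0; $start < $len; $start++) {
        for ($size = $min; $size <= $max && $start + $size <= $len; $size++) {
            $grams[] = substr($token, $start, $size);
        }
    }
    return $grams;
}

// Approximates ngram_analyzer: split into words, lowercase, ngram each word.
function analyzeWithNgram($text)
{
    $tokens = array();
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $tokens = array_merge($tokens, ngrams($word));
    }
    return $tokens;
}

// Approximates the built-in whitespace analyzer: split on whitespace only.
function analyzeWithWhitespace($text)
{
    return preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

$query = 'red car';
print_r(analyzeWithNgram($query));      // tokens: r, re, red, e, ed, d, c, ca, car, a, ar, r
print_r(analyzeWithWhitespace($query)); // tokens: red, car
```

With the ngram analyzer applied to the search term, the query turns into a pile of one- and two-character tokens that match almost any indexed document, which would be consistent with every entry coming back. With the whitespace search analyzer, only the whole words are looked up against the indexed ngrams, so partial matching still works. One thing to keep in mind is that the whitespace analyzer does not lowercase, while the indexed ngrams are lowercased, so it is probably worth lowercasing the search term before sending it.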