Phrase suggester for not_analyzed?

111106 · December 21, 2015, 7:54am

Is it possible to use a phrase suggester on fields that are indexed as not_analyzed? For some reason, I keep getting empty results.

Query looks something like:

POST http://localhost:9200/companies/_suggest
{
  "text" : "angel",
  "names" : {
    "phrase" : {
      "field" : "filter.name",
      "direct_generator": [
        {
          "field" : "filter.name",
          "suggest_mode" : "popular",
          "prefix_length": 2,
          "min_word_length": 3
        }
      ]
    }
  }
}

nik9000 · December 21, 2015, 2:01pm

Probably not. I haven't tried it, but I know that code reasonably well and it wants to use the analyzer for things. What are you trying to do?

111106 · December 21, 2015, 2:55pm

Well, I'd like to utilize some sort of error correction functionality and, optionally, an autocomplete. I have the same data indexed with nGrams and morphology plugins, but when I use the suggester on those fields, it basically results in some awkward tokens being returned instead.

What would you recommend?

nik9000 · December 21, 2015, 3:16pm

Will the term suggester work for you?
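For reference, a term suggester request against the same field might look like the sketch below. It reuses the index, field name, and option values from the phrase suggester query above; whether those values are a good fit for the term suggester is an assumption, not something tested here.

```json
POST http://localhost:9200/companies/_suggest
{
  "text" : "angel",
  "names" : {
    "term" : {
      "field" : "filter.name",
      "suggest_mode" : "popular",
      "prefix_length" : 2,
      "min_word_length" : 3
    }
  }
}
```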

111106 · December 21, 2015, 4:27pm

I don't think it works well on fields analyzed with nGrams. I tried it on not_analyzed fields and pretty much got the same, empty result.

nik9000 · December 21, 2015, 4:41pm

Sorry! I wouldn't point the suggester at a field that uses the ngram tokenizer either....

What about using the keyword analyzer instead of not_analyzed?

111106 · December 21, 2015, 5:29pm

I have to look into it, thank you. The reason I'm having duplicate, non-analyzed data is to be able to fetch documents with term filters. Would a keyword analyzer affect that?

nik9000 · December 21, 2015, 6:34pm

It should look the same as not_analyzed. It's worth testing on a smaller dataset first, though, both that it does and that it helps.

111106 · December 22, 2015, 12:44pm

I've tried indexing a few documents with a keyword analyzer. I've started getting some results, which are still very far from what I'm trying to achieve.

For example, I have a document with the name 'Music'. If I ask for a suggestion for 'Musi', ES returns 'Music'. But if I type it in lowercase, 'musi', it returns nothing, which is definitely not what I want.

I've tried using suggest-time analyzers with lowercasing, but those don't seem to help. Anything else I can do?

nik9000 · December 22, 2015, 2:31pm

Sure! Lots of stuff! It depends on which way you want the suggestions to work:

- Use a custom analyzer that has the keyword tokenizer and a lowercase filter. That should make 'musi' return 'music', but it will also make 'Musi' return 'music', because all the tokens are now lowercased.
- You may be able to write an analyzer that uppercases the first letter and use it with the analyzer option. I have much less experience with this and don't remember how that code works.
- There also appears to be a lowercase_terms option, which might do something for you as well.

It makes your life much easier if you are OK with 'Musi' suggesting 'music'.
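A minimal sketch of the first option, assuming 2.x-era index settings and mappings (the `string` field type). The analyzer name `keyword_lowercase` and the type name `company` are hypothetical, and mapping `filter.name` as a `name` property under a `filter` object is a guess at how the field in this thread is laid out.

```json
PUT http://localhost:9200/companies
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "keyword_lowercase" : {
          "type" : "custom",
          "tokenizer" : "keyword",
          "filter" : ["lowercase"]
        }
      }
    }
  },
  "mappings" : {
    "company" : {
      "properties" : {
        "filter" : {
          "properties" : {
            "name" : {
              "type" : "string",
              "analyzer" : "keyword_lowercase"
            }
          }
        }
      }
    }
  }
}
```

One thing to keep in mind with this setup: term filters are not analyzed, so a term filter on this field would need the lowercased value ('music' rather than 'Music') to match.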
