How to create a completion suggester index without duplicates?

Hi,

I want to maintain a simple index of search terms that is updated frequently by the users of my search API. For this purpose I created the following mapping:

{
    "mappings": {
        "_doc": {
            "properties": {
                "term": {
                    "type": "completion",
                    "analyzer": "simple",
                    "fields": {
                        "length": {
                            "type": "token_count",
                            "analyzer": "standard"
                        }
                    }
                }
            }
        }
    }
}

and these test documents:

{
    "term":  "test"
}

{
    "term":  "wiki test"
}

{
    "term":  "fast wiki test"
}

My problem revolves around my definition of a duplicate: two documents are duplicates whenever their term fields consist of exactly the same tokens, so the example above contains three distinct documents. This is why I added the token_count field to the mapping.
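To make that definition concrete, here is a minimal Python sketch of the duplicate check on the client side. The helper names `token_set` and `is_duplicate` are hypothetical, and the regular expression only approximates the `simple` analyzer (lowercase, split on anything that is not a letter); it is not Elasticsearch's own tokenizer.

```python
import re


def token_set(term):
    """Approximate the `simple` analyzer: lowercase, keep runs of letters."""
    # [^\W\d_]+ matches one or more Unicode letters (no digits, no underscore).
    return frozenset(re.findall(r"[^\W\d_]+", term.lower()))


def is_duplicate(a, b):
    """Two terms are duplicates when they consist of exactly the same tokens."""
    return token_set(a) == token_set(b)
```

Under this definition "wiki test" and "Test wiki" would count as duplicates, while "test", "wiki test" and "fast wiki test" are three distinct terms.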

Unfortunately, when I search with

{
    "suggest": {
        "querysuggest": {
            "prefix": "test",
            "completion": {
                "field": "term"
            }
        }
    }
}

I don't see a way to add a filter on term.length. In addition, prefix matching only works well if I use the input field at indexing time, but this breaks my token_count mapping:

{
    "term": {
        "input": [ "superfast", "wiki" ]
    }
}

This results in the following error:

{
    "error": {
        "root_cause": [
            {
                "type": "mapper_parsing_exception",
                "reason": "failed to parse [term.length]"
            }
        ],
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [term.length]",
        "caused_by": {
            "type": "illegal_state_exception",
            "reason": "Can't get text on a END_OBJECT at 4:2"
        }
    },
    "status": 400
}

So what can I do to get a search term index which only holds unique sets of tokens / string arrays?

Regards,

thurse

I figured it out myself: the point is to use the keyword field, just as in the documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html). Now I can search for duplicates with a term query on the keyword field and still get suggestions from the completion field. The only remaining problem is that the order of the search terms still matters, but that's something I can live with for now.
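A rough sketch of how such a setup could look, including one possible way around the ordering issue. Everything here is an assumption rather than the exact configuration used: the sibling keyword field `term_key` and the helper functions are hypothetical names, and the regular expression only approximates the `simple` analyzer. The idea is to normalize each term into a sorted, de-duplicated token string before indexing, because a keyword field is not analyzed and matches only on the exact stored value.

```python
import re

# Hypothetical mapping: the completion field plus a separate keyword field
# that stores a normalized key for exact-match duplicate lookups.
MAPPING = {
    "mappings": {
        "_doc": {
            "properties": {
                "term": {"type": "completion", "analyzer": "simple"},
                "term_key": {"type": "keyword"},
            }
        }
    }
}


def term_key(term):
    """Build an order-independent key: lowercase, unique, sorted tokens."""
    tokens = sorted(set(re.findall(r"[^\W\d_]+", term.lower())))
    return " ".join(tokens)


def build_document(term):
    """Document body: suggestions come from `term`, duplicates from `term_key`."""
    return {"term": {"input": [term]}, "term_key": term_key(term)}


def build_duplicate_query(term):
    """Search body for a term query that finds an already indexed duplicate."""
    return {"size": 1, "query": {"term": {"term_key": term_key(term)}}}
```

Before indexing a new term, the body from `build_duplicate_query` could be sent to the search endpoint; if it returns a hit, the term is already present and the document from `build_document` does not need to be indexed.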

Regards,

thurse
