A very similar thread is "Aggregation on suggestions results", but it doesn't really arrive at a good solution.
Let's go with the first approach, which gets us almost the previous behavior.
First, the mapping and sample docs for easy reproduction. I used 7.0 here, but this should work the same way on 6.x. Note that only the suggest field in the mapping is relevant; everything else could be skipped:
PUT test
{
"settings": {
"number_of_shards": 1,
"analysis": {
"filter": {
"edgeNGram_filter": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 25,
"side": "front"
},
"custom_ascii_folding": {
"type": "asciifolding",
"preserve_original": true
}
},
"analyzer": {
"edge_nGram_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"edgeNGram_filter",
"custom_ascii_folding"
]
},
"custom_suggest": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"custom_ascii_folding"
]
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"id": {
"type": "keyword"
},
"terms": {
"type": "keyword"
},
"payload": {
"type": "text",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "edge_nGram_analyzer",
"search_analyzer": "standard"
},
"raw": {
"type": "keyword"
}
}
},
"suggest": {
"type": "completion"
}
}
}
}
}
PUT test/_doc/1
{
"id": "1",
"terms": [
"austen",
"jane",
"rauchenberger",
"margarete"
],
"payload": [
"Austen, Jane",
"Rauchenberger, Margarete"
],
"suggest": {
"input": [
"austen",
"jane",
"rauchenberger",
"margarete"
]
}
}
PUT test/_doc/2
{
"id": "2",
"terms": [
"austen",
"jane",
"rauchenberger",
"margarete",
"thirkell",
"angela"
],
"payload": [
"Austen, Jane",
"Rauchenberger, Margarete",
"Thirkell, Angela"
],
"suggest": {
"input": [
"austen",
"jane",
"rauchenberger",
"margarete",
"thirkell",
"angela"
]
}
}
PUT test/_doc/3
{
"id": "3",
"terms": [
"austen",
"jane",
"rauchenberger",
"margarete",
"bowen",
"elizabeth"
],
"payload": [
"Austen, Jane",
"Rauchenberger, Margarete",
"Bowen, Elizabeth"
],
"suggest": {
"input": [
"austen",
"jane",
"rauchenberger",
"margarete",
"bowen",
"elizabeth"
]
}
}
PUT test/_doc/4
{
"id": "4",
"terms": [
"austen",
"jane",
"krämer",
"ilse"
],
"payload": [
"Austen, Jane",
"Krämer, Ilse"
],
"suggest": {
"input": [
"austen",
"jane",
"krämer",
"ilse"
]
}
}
PUT test/_doc/5
{
"id": "5",
"terms": [
"jane",
"austen",
"mozart"
],
"payload": "Jane Austen and Mozart",
"suggest": {
"input": [
"jane",
"austen",
"mozart"
]
}
}
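The sample documents above all copy their terms into suggest.input. As a minimal sketch of that pattern in Python (the helper name make_doc is hypothetical and not part of any Elasticsearch client), a document body could be built from its terms like this:

```python
import json


def make_doc(doc_id, terms, payload):
    """Build a document body whose completion inputs mirror its terms.

    make_doc is a hypothetical helper for illustration only.
    """
    return {
        "id": doc_id,
        "terms": list(terms),
        "payload": payload,
        # The completion field gets one input per term.
        "suggest": {"input": list(terms)},
    }


# Rebuild the body of sample document 5 from above.
doc5 = make_doc("5", ["jane", "austen", "mozart"], "Jane Austen and Mozart")
print(json.dumps(doc5, indent=2))
```

The printed JSON can be sent as the body of PUT test/_doc/5.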
And then the query is:
GET test/_search
{
  "_source": false,
  "suggest": {
    "term-suggest": {
      "prefix": "a",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}
This returns the following result (only the suggest part is shown):
"term-suggest" : [
  {
    "text" : "a",
    "offset" : 0,
    "length" : 1,
    "options" : [
      {
        "text" : "angela",
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0
      },
      {
        "text" : "austen",
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0
      }
    ]
  }
]
The important parts are:
- The prefix query, since I assume we need to start with the right letter(s) to get any results.
- skip_duplicates, to return every completion term only once. This renders the _id field pretty useless, since a term can belong to multiple documents but only one ID is returned.
- "_source": false, to avoid getting the actual documents back.
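
To tie the three points together, here is a minimal Python sketch that builds the same request body for an arbitrary prefix and pulls the terms out of a response. It works on plain dicts only; the function names build_suggest_body and extract_terms are assumptions made for illustration, and sending the request to the cluster is left out.

```python
def build_suggest_body(prefix, field="suggest", name="term-suggest"):
    """Return the search body used above for a given prefix (hypothetical helper)."""
    return {
        # Do not return the matching documents themselves.
        "_source": False,
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    # Return every completion term only once.
                    "skip_duplicates": True,
                },
            }
        },
    }


def extract_terms(response, name="term-suggest"):
    """Collect the suggested terms from a search response dict (hypothetical helper)."""
    terms = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            # The _id is ignored on purpose: with skip_duplicates it is
            # only one of possibly several matching documents.
            terms.append(option["text"])
    return terms


# Trimmed copy of the response shown above.
sample_response = {
    "suggest": {
        "term-suggest": [
            {
                "text": "a",
                "offset": 0,
                "length": 1,
                "options": [
                    {"text": "angela", "_id": "2", "_score": 1.0},
                    {"text": "austen", "_id": "1", "_score": 1.0},
                ],
            }
        ]
    }
}

body = build_suggest_body("a")
print(extract_terms(sample_response))  # ['angela', 'austen']
```

Only the text of each option is kept here, which matches the point that the returned _id is not meaningful once duplicates are skipped.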