Hello! I built a search application with bibliographic data in ES 2.4.4.
I tested different approaches to autocomplete:
- completion suggest
- aggregation
My data (examples):
{"id":"1","terms":["austen","jane","rauchenberger","margarete"],"payload":["Austen, Jane","Rauchenberger, Margarete"],"suggest":{"input":["austen","jane","rauchenberger","margarete"]}}
{"id":"2","terms":["austen","jane","rauchenberger","margarete","thirkell","angela"],"payload":["Austen, Jane","Rauchenberger, Margarete","Thirkell, Angela"],"suggest":{"input":["austen","jane","rauchenberger","margarete","thirkell","angela"]}}
{"id":"3","terms":["austen","jane","rauchenberger","margarete","bowen","elizabeth"],"payload":["Austen, Jane","Rauchenberger, Margarete","Bowen, Elizabeth"],"suggest":{"input":["austen","jane","rauchenberger","margarete","bowen","elizabeth"]}}
{"id":"4","terms":["austen","jane","kr\u00e4mer","ilse"],"payload":["Austen, Jane","Kr\u00e4mer, Ilse"],"suggest":{"input":["austen","jane","kr\u00e4mer","ilse"]}}
{"id":"5","terms":["jane","austen","mozart"],"payload":"Jane Austen and Mozart","suggest":{"input":["jane","austen","mozart"]}}
My index: /mysuggest/title/
{
  "settings": {
    "analysis": {
      "filter": {
        "edgeNGram_filter": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 25,
          "side": "front"
        },
        "custom_ascii_folding": {
          "type": "asciifolding",
          "preserve_original": true
        }
      },
      "analyzer": {
        "edge_nGram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "edgeNGram_filter",
            "custom_ascii_folding"
          ]
        },
        "custom_suggest": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "custom_ascii_folding"
          ]
        }
      }
    }
  },
  "mappings": {
    "title": {
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "terms": {
          "type": "string",
          "index": "not_analyzed"
        },
        "payload": {
          "type": "string",
          "fields": {
            "autocomplete": {
              "type": "string",
              "analyzer": "edge_nGram_analyzer",
              "search_analyzer": "standard"
            },
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "suggest": {
          "type": "completion",
          "analyzer": "custom_suggest",
          "search_analyzer": "standard",
          "payloads": false
        }
      }
    }
  }
}
- First approach: completion suggester
The desired output is an array of option objects, as returned by ES versions before 5.x:
{text: "xxx", score: 1}
This changed in versions 5.x and later.
Query:
http://localhost:9200/mysuggest/title/_search
{
  "size": 0,
  "suggest": {
    "term-suggest": {
      "text": "a",
      "completion": {
        "field": "suggest",
        "size": "20"
      }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": [ ]
  },
  "suggest": {
    "term-suggest": [
      {
        "text": "a",
        "offset": 0,
        "length": 1,
        "options": [
          {
            "text": "angela",
            "score": 1
          },
          {
            "text": "austen",
            "score": 1
          }
        ]
      }
    ]
  }
}
Question: How can I achieve this with ES 6?
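To make the target shape concrete, here is a minimal Python sketch of the client-side reshaping I would like to avoid. It assumes that completion options in ES 5.x/6.x carry `text` and `_score` (plus document metadata) instead of `text` and `score`, and that the same text can come back once per matching document. The sample response is made up, and `to_legacy_options` is a hypothetical helper name, not part of any client library.

```python
def to_legacy_options(response, suggest_name="term-suggest"):
    """Reshape completion options into the 2.x style [{"text", "score"}] list."""
    seen = set()
    legacy = []
    for entry in response.get("suggest", {}).get(suggest_name, []):
        for option in entry.get("options", []):
            text = option["text"]
            if text in seen:
                # Drop repeated texts coming from different documents.
                continue
            seen.add(text)
            # Assumption: newer versions report "_score"; fall back to "score".
            score = option.get("_score", option.get("score"))
            legacy.append({"text": text, "score": score})
    return legacy


# Made-up sample, shaped the way I understand an ES 6 response to look.
sample = {
    "suggest": {
        "term-suggest": [
            {
                "text": "a",
                "offset": 0,
                "length": 1,
                "options": [
                    {"text": "angela", "_id": "2", "_score": 1.0},
                    {"text": "austen", "_id": "1", "_score": 1.0},
                    {"text": "austen", "_id": "2", "_score": 1.0},
                ],
            }
        ]
    }
}

print(to_legacy_options(sample))
```

I would prefer a way to get this directly from the suggester instead of post-processing every response.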
- Second approach: aggregations
The problem with aggregations is that I also get buckets for terms that do not match the query:
Query:
http://localhost:9200/mysuggest/title/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "terms": "austen"
          }
        }
      ]
    }
  },
  "aggregations": {
    "top_terms": {
      "terms": {
        "field": "payload.raw",
        "size": 10,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": [ ]
  },
  "aggregations": {
    "top_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Austen, Jane",
          "doc_count": 4
        },
        {
          "key": "Rauchenberger, Margarete",
          "doc_count": 3
        },
        {
          "key": "Bowen, Elizabeth",
          "doc_count": 1
        },
        {
          "key": "Jane Austen and Mozart",
          "doc_count": 1
        },
        {
          "key": "Krämer, Ilse",
          "doc_count": 1
        },
        {
          "key": "Thirkell, Angela",
          "doc_count": 1
        }
      ]
    }
  }
}
I only want these two buckets in the result, because only they match "austen":
{"key": "Austen, Jane", "doc_count": 4}
{ "key": "Jane Austen and Mozart","doc_count": 1}
I've tried filtering the query, with the same result.
The only way to get the desired result set is to programmatically filter out unwanted results.
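For illustration, this is roughly the post-filtering I mean, as a minimal Python sketch. `filter_buckets` is a hypothetical helper name, and the bucket list is copied from the result above. It does a plain case-insensitive substring match, so it does not reproduce the ASCII folding of my analyzers (for example, "kramer" would not match "Krämer, Ilse").

```python
def filter_buckets(buckets, term):
    """Keep only buckets whose key contains the search term (case-insensitive)."""
    needle = term.casefold()
    return [bucket for bucket in buckets if needle in bucket["key"].casefold()]


# Buckets as returned by the aggregation above.
buckets = [
    {"key": "Austen, Jane", "doc_count": 4},
    {"key": "Rauchenberger, Margarete", "doc_count": 3},
    {"key": "Bowen, Elizabeth", "doc_count": 1},
    {"key": "Jane Austen and Mozart", "doc_count": 1},
    {"key": "Krämer, Ilse", "doc_count": 1},
    {"key": "Thirkell, Angela", "doc_count": 1},
]

for bucket in filter_buckets(buckets, "austen"):
    print(bucket["key"], bucket["doc_count"])
```

The drawback is that the aggregation `size` is applied before this filter, so matching keys could be cut off when many non-matching keys have higher counts.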
Is there any solution?
See this online: http://rxs.bibliothecauniversalis.net/ with 4 optional settings.
Thanks a lot for your help.
Kind regards
JP Weiner