Hello,
I have a nested mapping as follows:
{
  "fbs": {
    "mappings": {
      "institution": {
        "properties": {
          "document": {
            "type": "nested",
            "properties": {
              "document_id": {
                "type": "long"
              },
              "expiration_date": {
                "type": "date"
              },
              "flags": {
                "type": "text",
                "norms": false,
                "analyzer": "comma"
              },
              "is_active": {
                "type": "boolean"
              },
              "is_current": {
                "type": "boolean"
              },
              "name": {
                "type": "text"
              },
              "section": {
                "type": "nested",
                "properties": {
                  "created_at": {
                    "type": "date"
                  },
                  "data": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "file": {
                    "type": "nested",
                    "properties": {
                      "author": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "content": {
                        "type": "text",
                        "analyzer": "standard",
                        "search_analyzer": "fbs_search_analyzer"
                      },
                      "content_length": {
                        "type": "long"
                      },
                      "content_type": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "date": {
                        "type": "date"
                      },
                      "keywords": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "language": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "title": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      }
                    }
                  },
                  "filename": {
                    "type": "text",
                    "norms": false
                  },
                  "fingerprint": {
                    "type": "text",
                    "norms": false
                  },
                  "flags": {
                    "type": "text",
                    "norms": false,
                    "analyzer": "comma"
                  },
                  "is_active": {
                    "type": "boolean"
                  },
                  "name": {
                    "type": "text"
                  },
                  "section_id": {
                    "type": "long"
                  },
                  "updated_at": {
                    "type": "date"
                  }
                }
              },
              "start_date": {
                "type": "date"
              }
            }
          },
          "institution_id": {
            "type": "long"
          },
          "name_en": {
            "type": "text",
            "norms": false
          },
          "name_fr": {
            "type": "text",
            "norms": false
          },
          "region_id": {
            "type": "long"
          }
        }
      }
    }
  }
}
I am currently searching on document.section.file.content, for example, to pull out documents that match my keywords. However, what I want to do is limit the results to a single match (the one with the highest relevance) per unique document_id.
That is, I have a single logical "document" that is represented by 10 PDF "sections". I am searching those sections, but I only want to return the top result per document (i.e. at most 1 result per document_id).
Is there a way I can coax ES into doing this or am I better off just post-processing the results?
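To make the post-processing option concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration: it assumes the search hits have already been flattened into plain dicts that carry a `document_id` and a `_score`, and the function name `top_hit_per_document` and the sample data are made up for this example, not taken from my real code.

```python
from typing import Any, Dict, Iterable, List


def top_hit_per_document(hits: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the highest-scoring hit for each document_id."""
    best: Dict[Any, Dict[str, Any]] = {}
    for hit in hits:
        doc_id = hit["document_id"]
        current = best.get(doc_id)
        # Replace the stored hit only when this one scores strictly higher.
        if current is None or hit["_score"] > current["_score"]:
            best[doc_id] = hit
    # Return the survivors ordered by relevance, best first.
    return sorted(best.values(), key=lambda h: h["_score"], reverse=True)


# Hypothetical, already-flattened hits (assumed shape, not real ES output).
sample_hits = [
    {"document_id": 1, "section_id": 10, "_score": 2.5},
    {"document_id": 1, "section_id": 11, "_score": 4.0},
    {"document_id": 2, "section_id": 20, "_score": 3.1},
]
print(top_hit_per_document(sample_hits))
```

The drawback I see with this approach is paging: since several hits can collapse into one, I would have to over-fetch from ES to be sure of filling a page of results.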
John