Let's assume my index's data looks something like this:
[
{
"_index": "doc_pages",
"_type": "pages",
"_id": "264b406593732ffbd15a5ed4b4e3b5af",
"_score": 1,
"_source": {
"record_id": "264b406593732ffbd15a5ed4b4e3b5af",
"title": "Page Title 1",
"title_exact": "Page Title 1",
"versions": [
{
"page_id": "fbaf1d5c-dff0-42bb-a41b-3248f0d115d0",
"page_slug": "page-title-1",
"page_content": "Page Title 1 You can test whether the instances were configured properly. Refer to Page Title 1.",
"document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
"document_title": "Document Title 1",
"document_slug": "document-title-1",
"software_version": "5.6.3"
},
{
"page_id": "d6d717e2-1868-4f6d-b93d-e42d15f68058",
"page_slug": "connectivity-test",
"page_content": "Page Title 1 You can test whether the instances were configured properly. Refer to Page Title 1.",
"document_id": "012fca84-1911-11e9-b86b-00505692583a",
"document_title": "Document Title 1",
"document_slug": "document-title-1",
"software_version": "6.0.0"
}
]
}
},
{
"_index": "doc_pages",
"_type": "pages",
"_id": "c47b24da3e68e2c854ed8bc0436e7384",
"_score": 1,
"_source": {
"record_id": "c47b24da3e68e2c854ed8bc0436e7384",
"title": "Introduction",
"title_exact": "Introduction",
"versions": [
{
"page_id": "6329c7fb-a1f1-4fd9-ae78-0db702c411f5",
"page_slug": "introduction",
"page_content": "Introduction You can configure with two instances using Highly Available Virtual IP, which is configurable on the platform.",
"document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
"document_title": "Document Title 1",
"document_slug": "document-title-1",
"software_version": "5.6.3"
},
{
"page_id": "c091675c-7e82-4533-a78e-688770be7ce7",
"page_slug": "introduction",
"page_permanent_id": "647589",
"page_content": "Introduction You can configure with two instances using Highly Available Virtual IP (HAVIP), which is configurable on the platform.",
"document_id": "012fca84-1911-11e9-b86b-00505692583a",
"document_title": "Document Title 1",
"document_slug": "document-title-1",
"software_version": "6.0.0"
}
]
}
}
]
And my query currently looks like this:
{
"size": 30,
"query": {
"bool": {
"minimum_should_match": 1,
"filter": {
"nested": {
"path": "versions",
"query": {
"term": {
"versions.document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a"
}
}
}
},
"should": [
{
"dis_max": {
"queries": [
{
"term": {
"title_exact": "HAVIP"
}
},
{
"query_string": {
"fields": [
"title"
],
"query": "HAVIP"
}
},
{
"nested": {
"path": "versions",
"query": {
"query_string": {
"fields": [
"versions.page_content"
],
"query": "HAVIP"
}
}
}
}
]
}
}
]
}
}
}
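The problem I face right now is that the record with ID c47b24da3e68e2c854ed8bc0436e7384 shows up in my results.
ES did find the string, and the filter is technically correct, since the provided document ID is a part of that record. However, the filter matches the 5.6.3 version, while "HAVIP" only appears in the page_content of the 6.0.0 version, which belongs to a different document ID.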
What I'm trying to do now is make the search less forgiving and more explicit: the text query should only be evaluated inside the version object whose document_id matches the provided document ID, while otherwise performing the same kind of search.
At this point, I'm not quite sure how to get Elasticsearch to look only into a specific nested object selected by a term match.
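To illustrate what I'm after, here is an untested sketch of the nested clause from my dis_max, rewritten so that the document ID term and the text query sit inside the same nested query. It assumes that clauses combined within a single nested query are evaluated against the same versions object; the bool/filter/must arrangement is my guess at how to express that, not something I've confirmed on 6.3.2:

```json
{
  "nested": {
    "path": "versions",
    "query": {
      "bool": {
        "filter": {
          "term": {
            "versions.document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a"
          }
        },
        "must": {
          "query_string": {
            "fields": [
              "versions.page_content"
            ],
            "query": "HAVIP"
          }
        }
      }
    }
  }
}
```

The title and title_exact clauses would stay as they are, since those fields live on the root document rather than inside versions.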
Any help on this matter would be very much appreciated.
The ES version is 6.3.2.
Cheers!