Match only if query is found in a field found within a specific nested object, and nothing else

Let's assume my index's data looks something like this:

[
  {
    "_index": "doc_pages",
    "_type": "pages",
    "_id": "264b406593732ffbd15a5ed4b4e3b5af",
    "_score": 1,
    "_source": {
      "record_id": "264b406593732ffbd15a5ed4b4e3b5af",
      "title": "Page Title 1",
      "title_exact": "Page Title 1",
      "versions": [
        {
          "page_id": "fbaf1d5c-dff0-42bb-a41b-3248f0d115d0",
          "page_slug": "page-title-1",
          "page_content": "Page Title 1 You can test whether the instances were configured properly. Refer to Page Title 1.",
          "document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
          "document_title": "Document Title 1",
          "document_slug": "document-title-1",
          "software_version": "5.6.3"
        },
        {
          "page_id": "d6d717e2-1868-4f6d-b93d-e42d15f68058",
          "page_slug": "connectivity-test",
          "page_content": "Page Title 1 You can test whether the instances were configured properly. Refer to Page Title 1.",
          "document_id": "012fca84-1911-11e9-b86b-00505692583a",
          "document_title": "Document Title 1",
          "document_slug": "document-title-1",
          "software_version": "6.0.0"
        }
      ]
    }
  },
  {
    "_index": "doc_pages",
    "_type": "pages",
    "_id": "c47b24da3e68e2c854ed8bc0436e7384",
    "_score": 1,
    "_source": {
      "record_id": "c47b24da3e68e2c854ed8bc0436e7384",
      "title": "Introduction",
      "title_exact": "Introduction",
      "versions": [
        {
          "page_id": "6329c7fb-a1f1-4fd9-ae78-0db702c411f5",
          "page_slug": "introduction",
          "page_content": "Introduction You can configure with two instances using Highly Available Virtual IP, which is configurable on the platform.",
          "document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
          "document_title": "Document Title 1",
          "document_slug": "document-title-1",
          "software_version": "5.6.3"
        },
        {
          "page_id": "c091675c-7e82-4533-a78e-688770be7ce7",
          "page_slug": "introduction",
          "page_permanent_id": "647589",
          "page_content": "Introduction You can configure with two instances using Highly Available Virtual IP (HAVIP), which is configurable on the platform.",
          "document_id": "012fca84-1911-11e9-b86b-00505692583a",
          "document_title": "Document Title 1",
          "document_slug": "document-title-1",
          "software_version": "6.0.0"
        }
      ]
    }
  }
]

And my query currently looks like this:

{
  "size": 30,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "filter": {
        "nested": {
          "path": "versions",
          "query": [
            {
              "term": {
                "versions.document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a"
              }
            }
          ]
        }
      },
      "should": [
        {
          "dis_max": {
            "queries": [
              {
                "term": {
                  "title_exact": "HAVIP"
                }
              },
              {
                "query_string": {
                  "fields": [
                    "title"
                  ],
                  "query": "HAVIP"
                }
              },
              {
                "nested": {
                  "path": "versions",
                  "query": {
                    "query_string": {
                      "fields": [
                        "versions.page_content"
                      ],
                      "query": "HAVIP"
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

The problem I face right now is that record ID c47b24da3e68e2c854ed8bc0436e7384 shows up in my results. That's the complaint. Technically ES is right: it found the string, and the filter is satisfied because this document ID is a part of the record. But the string and the document ID sit in two different version objects.
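
To make the mismatch concrete, here is a rough Python sketch (not Elasticsearch itself) of how the two independent nested clauses behave against this record. The word splitting is only an approximation of the standard analyzer, and the helper names are made up for illustration.

```python
import re

# Trimmed copy of the "versions" array from record c47b24da3e68e2c854ed8bc0436e7384.
versions = [
    {"document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
     "page_content": "Introduction You can configure with two instances using "
                     "Highly Available Virtual IP, which is configurable on the platform."},
    {"document_id": "012fca84-1911-11e9-b86b-00505692583a",
     "page_content": "Introduction You can configure with two instances using "
                     "Highly Available Virtual IP (HAVIP), which is configurable on the platform."},
]

def tokens(text):
    # Crude stand-in for an analyzer: lowercase and split on non-word characters.
    return set(re.findall(r"\w+", text.lower()))

def has_document(version, document_id):
    return version["document_id"] == document_id

def has_term(version, term):
    return term.lower() in tokens(version["page_content"])

doc_id = "0cfc5f29-ad3e-11e8-8784-00505692583a"

# Each nested clause is satisfied if ANY version object matches it.
filter_hits = [i for i, v in enumerate(versions) if has_document(v, doc_id)]
content_hits = [i for i, v in enumerate(versions) if has_term(v, "HAVIP")]
separate_clauses_match = bool(filter_hits) and bool(content_hits)

# What I actually want: both conditions true for the SAME version object.
same_object_match = any(
    has_document(v, doc_id) and has_term(v, "HAVIP") for v in versions
)

print(filter_hits, content_hits)                    # [0] [1]
print(separate_clauses_match, same_object_match)    # True False
```

So the filter is satisfied by the 5.6.3 version and the text clause by the 6.0.0 version, which is why the record comes back.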

What we're trying to do now is make the search less forgiving and more explicit. The search term should only count as a match when it is found inside the version object whose document_id matches the provided document ID, while the query otherwise stays similar to the one above.

At this point, I'm not quite sure how to get Elasticsearch to look only inside a specific sub-object selected by a term match.

Any help on this matter would be very much appreciated.

ES version is 6.3.2

Cheers!

Alright cool, so "impossible" is what I'm hearing? :wink:

I'm afraid I didn't really get what you're looking for.

Given that input, what would your desired output be?
Would you like only the nested object to be returned? This one I mean:

{
  "page_id": "6329c7fb-a1f1-4fd9-ae78-0db702c411f5",
  "page_slug": "introduction",
  "page_content": "Introduction You can configure with two instances using Highly Available Virtual IP, which is configurable on the platform.",
  "document_id": "0cfc5f29-ad3e-11e8-8784-00505692583a",
  "document_title": "Document Title 1",
  "document_slug": "document-title-1",
  "software_version": "5.6.3"
}

Also, I'm not really sure that query even works the way you describe, since it looks for a nested object at path versions, while versions seems to hold only an array of plain objects rather than nested ones (which is quite different).
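
If versions is in fact mapped as a nested field, one possible approach is to put both conditions inside a single nested query, so that they have to be satisfied by the same version object. The sketch below builds such a request body as a Python dict. It assumes versions.document_id is indexed as an exact-value (keyword) field, the build_query helper is hypothetical, and the title clauses from the original query are left out for brevity.

```python
import json

def build_query(document_id, text, size=30):
    """Require the document_id term and the text query to match the same nested object."""
    return {
        "size": size,
        "query": {
            "nested": {
                "path": "versions",
                "query": {
                    "bool": {
                        # Non-scoring restriction to one document's version object.
                        "filter": [
                            {"term": {"versions.document_id": document_id}}
                        ],
                        # Scoring text match, evaluated against that same object.
                        "must": [
                            {
                                "query_string": {
                                    "fields": ["versions.page_content"],
                                    "query": text,
                                }
                            }
                        ],
                    }
                },
                # Ask for the matching nested objects to be returned with each hit.
                "inner_hits": {},
            }
        },
    }

body = build_query("0cfc5f29-ad3e-11e8-8784-00505692583a", "HAVIP")
print(json.dumps(body, indent=2))
```

With the sample data, record c47b24da3e68e2c854ed8bc0436e7384 should no longer match this document ID, because "HAVIP" only appears in the version belonging to the other document. The inner_hits section is optional and only matters if you want to see which version object matched.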
