Correct Way to Execute Not Exists and Not Empty Query for Not_Analyzed Fields in Nested Document

Hi,

I have a "not_analyzed" field as part of a nested document in my schema. A sample document is given below

"_source": {
    "identifier": 123,
    .
    .
    "nested_doc_name_xyz":[
      {
        "not_analyzed_field_name_xyz": [
           "ABCD",
           "1234",
           ""
         ]
      }
     ]
    }

To search for documents where not_analyzed_field_name_xyz does not exist and is not empty, the following is the query chunk I am using.

{
  "bool": {
    "filter": {
      "bool": {
        "must_not": {
          "nested": {
            "path": "nested_doc_name_xyz",
            "query": {
              "bool": {
                "must": {
                  "exists": {
                    "field": "nested_doc_name_xyz.not_analyzed_field_name_xyz"
                  }
                },
                "must_not": {
                  "term": {
                    "nested_doc_name_xyz.not_analyzed_field_name_xyz": ""
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Since the sample document I provided earlier, has an empty string as part of the whole value,

"not_analyzed_field_name_xyz": [
           "ABCD",
           "1234",
           ""
         ]

the search query given above is not returning the document with identifier 123.

Ideally, I want to filter out documents which

  1. do not have "nested_doc_name_xyz.not_analyzed_field_name_xyz" this field
    and
  2. those which have value for this field as ("nested_doc_name_xyz.not_analyzed_field_name_xyz": [""] or "nested_doc_name_xyz.not_analyzed_field_name_xyz":"")

Please let me know if you need any more details.

Can you post the index mappings for your sample document?

I believe the query needs to look like:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          # nested query satisfying condition 1
        },
        {
          "bool": {
            "should": [
              {
                # nested condition 2 checking for blank value
              },
              {
                # nested condition 2 checking for blank array
              }
            ]
          }
        }
      ]
    }
  }
}

Having the index mappings might be helpful.

Thanks.

@RahulD

Hi Rahul,

Please find the mapping below.

"mappings": { 
  "sample" : {
    "_all" : { 
      "enabled": false
     }, 
    "dynamic" : "strict",
    "properties" : {
      "nested_doc_name_xyz" : { 
        "type" : "nested", 
        "dynamic" : "strict", 
        "properties" : {
          "not_analyzed_field_name_xyz" : { 
            "type" : "string", 
            "index" : "not_analyzed", 
            "doc_values": true 
          } 
        }
      } 
   } 
}

The query you suggested still might not work since as I mentioned earlier, for a document which has the following value

"not_analyzed_field_name_xyz": [
           "ABCD",
           "1234",
           ""
         ]

the following query

must_not": {
  "term": {
    "nested_doc_name_xyz.not_analyzed_field_name_xyz": ""
  }
}

returns true. But I am looking for a query which returns true ONLY when the value for the field's value is just "".

for example,

"not_analyzed_field_name_xyz": [
           ""
         ]

It has an empty string ("") as the only constituent. This document should be true whereas

"not_analyzed_field_name_xyz": [
           "ABCD",
           "1234",
           ""
         ]

This should be false

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.