Count of items matched in array in elasticsearch aggregation

I need to get count of matched elements in array from elasticsearch. here is my data model

{
"ProcessInstance":
    [
        {
            "_status": "InProgress",
            "AssignedTo":
                [
                    {
                        "_id": "flobot",
                        "Name": "flobot",

                        "Kind": "User"
                    }
                ]
        },
        {
            "_status": "InProgress",
            "AssignedTo":
                [
                    {
                        "_id": "sangeeth",
                        "Name": "sangeeth",

                        "Kind": "User"
                    }
                ]
        }
    ]

}

My query : {"ProcessInstance._status": "InProgress", "ProcessInstance.AssignedTo._id": "sangeeth"}

Expected output:
Count: 1

To do such a thing you probably need to use nested documents.

But why not indexing instead every single ProcessInstance as one document? Instead of an array of ProcessInstance.

This is my business requirement. I have to store array like this. How do i use nested mapping for this?

Ok. Have a look at https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

Still am not getting actual count, this is my mapping

My mapping:

     {
"mappings": {
    "Process": {
        "properties": {
            "ProcessInstance": {
                "type": "nested",
                "properties": {
                        "_status": {"type": "string"},
                        "AssignedTo": {
                            "type": "nested",
                            "properties": {
                                "_id": {"type": "string"}
                            }
                        }
                }
            }


        }
    }
}

}

My Query:

{
"query": {
    "bool": {
        "must": [
            {
                "nested": {
                    "path": "ProcessInstance",
                    "query": {
                        "term": {
                            "ProcessInstance._status.keyword": "InProgress"
                        }
                    }
                }
            },
            {
                "nested": {
                    "path": "ProcessInstance",
                    "query": {
                        "term": {
                            "ProcessInstance.AssignedTo._id": "sangeeth"
                        }
                    }
                }
            }
        ]
    }
},
"aggs": {
    "ProcessInstance": {
        "nested": {
            "path": "ProcessInstance"
        },
        "aggs": {
            "_status": {
                "terms": {
                    "field": "ProcessInstance._status.keyword"
                }
            }
        }
    }
}

}

It returns nothing. Help me to get the matched count from array. I worried about this issue for past one week

Could you provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.

Thank you for your reply. Here is the recreation script

DELETE index001
PUT index001
{
    "mappings": {
        "Process": {
            "properties": {
                "ProcessInstance": {
                    "type": "nested",
                    "properties": {
                        "_status": {"type": "string"},
                        "AssignedTo": {
                            "type": "nested",
                            "properties": {
                                "_id": {"type": "string"}
                            }
                        }
                    }
                }


            }
        }
    }
}
PUT index001/Process/1/_update?refresh=true
{
"Name": "PrTest",
"Step": "First",
"ProcessInstance":
    [
        {
            "_status": "InProgress",
            "AssignedTo":
                [
                    {
                        "_id": "flobot",
                        "Name": "flobot",

                        "Kind": "User"
                    }
                ]
        },
        {
            "_status": "InProgress",
            "AssignedTo":
                [
                    {
                        "_id": "sangeeth",
                        "Name": "sangeeth",

                        "Kind": "User"
                    }
                ]
        }
    ]
}
GET index001/_search
{
"query": {
    "bool": {
        "must": [
            {
                "nested": {
                    "path": "ProcessInstance",
                    "query": {
                        "term": {
                            "ProcessInstance._status.keyword": "InProgress"
                        }
                    }
                }
            },
            {
                "nested": {
                    "path": "ProcessInstance",
                    "query": {
                        "term": {
                            "ProcessInstance.AssignedTo._id": "sangeeth"
                        }
                    }
                }
            }
        ]
    }
},
"aggs": {
    "ProcessInstance": {
        "nested": {
            "path": "ProcessInstance"
        },
        "aggs": {
            "_status": {
                "terms": {
                    "field": "ProcessInstance._status.keyword"
                }
            }
        }
    }
}
}

Let's focus first on the query part.
It can not match because AssignedTo is also a nested document.
It can't be accessed directly with:

  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "ProcessInstance",
            "query": {
              "term": {
                "ProcessInstance._status.keyword": "InProgress"
              }
            }
          }
        },
        {
          "nested": {
            "path": "ProcessInstance",
            "query": {
              "term": {
                "ProcessInstance.AssignedTo._id": "sangeeth"
              }
            }
          }
        }
      ]
    }
  }

Also .keyword field does not exist in your mapping.

One way to make the query return something is:

GET index001/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "ProcessInstance",
            "query": {
              "term": {
                "ProcessInstance._status": "inprogress"
              }
            }
          }
        },
        {
          "nested": {
            "path": "ProcessInstance.AssignedTo",
            "query": {
              "term": {
                "ProcessInstance.AssignedTo._id": "sangeeth"
              }
            }
          }
        }
      ]
    }
  }
}
1 Like

This recreation steps gives my exact count match from array

  DELETE index001
  PUT index001
  {
      "mappings": {
          "Process": {
              "properties": {
                  "ProcessInstance": {
                      "type": "nested"
                  }


              }
          }
      }
  }
  PUT index001/Process/1
  	{
  	"Name": "PrTest",
  	"Step": "Second",
  	"_status": "InProgress",
  	"ProcessInstance":
  	    [
  	        {
  	            "_status": "Completed",
  	            "AssignedTo":
  	                [
  	                    {
  	                        "_id": "sangeeth",
  	                        "Name": "sangeeth",

  	                        "Kind": "User"
  	                    }
  	                ]
  	        },
  	        {
  	            "_status": "InProgress",
  	            "AssignedTo":
  	                [
  	                    {
  	                        "_id": "sangeeth",
  	                        "Name": "sangeeth",

  	                        "Kind": "User"
  	                    }
  	                ]
  	        }
  	    ]
  	}
  GET index001/_search
  {
    "size": 0,
     "query": {
        "bool": {
           "must": {
              "match": {
                 "_status.keyword": "InProgress"
              }
           },
           "filter": {
              "nested": {
                 "path": "ProcessInstance",
                 "query": {
                   "bool":{"must":[{
                    "match": {
                       "ProcessInstance.AssignedTo._id": "sangeeth"
                    }
                   },
                   {
                    "match": {
                       "ProcessInstance._status.keyword": "InProgress"
                    }
                   }]
                   }
                 }
              }
           }
        }
     },
     "aggs": {
        "nested_events": {
           "nested": {
              "path": "ProcessInstance"
           },
           "aggs": {
              "filtered_events": {
                 "filter": {
                   "bool":{"must":[{
                    "term": {
                       "ProcessInstance.AssignedTo._id": "sangeeth"
                    }
                   },
                   {
                    "term": {
                       "ProcessInstance._status.keyword": "InProgress"
                    }
                   }]
                   }
                 }
              }
           }
        }
     }
  }

Returns:

{
  ....
  "aggregations": {
    "nested_events": {
      "doc_count": 2,
      "filtered_events": {
        "doc_count": 1
      }
    }
  }
}

Count looks fine now.
Is there any way to return only the matched item as result hits?
Expected output:

   {
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "index001",
        "_type": "Process",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "Name": "PrTest",
          "Step": "Second",
          "_status": "InProgress",
          "ProcessInstance": [
            {
              "_status": "InProgress",
              "AssignedTo": [
                {
                  "_id": "sangeeth",
                  "Name": "sangeeth",
                  "Kind": "User"
                }
              ]
            }
          ]
        }
      }
      
    ]
  },
  "aggregations": {
    "nested_events": {
      "doc_count": 2,
      "filtered_events": {
        "doc_count": 1
      }
    }
  }
}

You need to use nested for AssignedTo and then probably use https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.