How to get array size for each document

ntsh999 · April 5, 2020, 10:42pm

I want the size of an array of objects ('breaches' here) which is present in each document to be returned along with the fields present in the document once the query is executed. How can I do this? Below is my document structure and mapping. I tried with inline script but did not get success. I want name, age, email, contact, total number of breaches to be returned for each document.

{
	"age": 45,
	"email": "",
	"contact": 9899999999,
	"name": "xyx",
	"breaches": [{
		"_id": "5e833cae56cd2e0011ee8e4a",
		"citizen": "5e82e0f5fb4f240012fa9737",
		"breachType": "location_disabled",
		"timestamp": "2020-03-31T12:50:52.828Z",
		"division": null,
		"createdAt": "2020-03-31T12:50:54.371Z"
	}]
}

Mapping

{
	"properties": {
		"age": {
			"type": "long"
		},
		"email": {
			"type": "text",
			"fields": {
				"keyword": {
					"type": "keyword",
					"ignore_above": 256
				}
			}
		},
		"contact": {
			"type": "long"
		},
		"name": {
			"type": "text",
			"fields": {
				"keyword": {
					"type": "keyword",
					"ignore_above": 256
				}
			}
		},
		"breaches": {
			"type": "nested",
			"properties": {
				"_id": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				},
				"breachType": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				},
				"citizen": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				},
				"createdAt": {
					"type": "date"
				},
				"division": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				}
			}
		}
	}
}

Luca_Belluccini · April 5, 2020, 11:08pm

Hello @ntsh999,

The "easiest" solution is using scripted fields, but it is gonna to be slow and consume resources, as you're accessing to the _source:

GET your_index_name/_search
{
  "script_fields": {
    "number_of_breaches": {
      "script": {
        "source": "params['_source'].breaches.length"
      }
    }
  },
  "_source": {
    "excludes": [ "breaches" ]
  }
}
# Response
{
  "took" : 672,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "mytest",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "contact" : 9899999999,
          "name" : "xyx",
          "age" : 45,
          "email" : ""
        },
        "fields" : {
          "number_of_breaches" : [
            1
          ]
        }
      }
    ]
  }
}

The best solution is usually to compute such fields at indexing time, for example using the Ingest Pipelines (documentation):

PUT _ingest/pipeline/count_breaches
{
    "description": "Counts the breaches",
    "processors": [
      {
        "script": {
          "source": "ctx.number_of_breaches = ctx.containsKey('breaches') ? ctx.breaches.length : 0"
        }
      }
    ]
}

The pipeline can be used when indexing the document.

ntsh999 · April 6, 2020, 8:01am

HI @Luca_Belluccini,
Thanks the solution provided by you works absolutely fine.

system · May 4, 2020, 8:01am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Scripting Query for returning cases where array size is larger than X Elasticsearch	3	974	February 11, 2020
How to get nested type array length? Elasticsearch	2	2223	May 9, 2019
Retrieve length of array which is part of object present in map Elasticsearch	0	8	August 15, 2024
Aggregate "array" length's across documents Elasticsearch	2	3845	December 6, 2020
Count array size in all document Elasticsearch	8	3358	May 1, 2021

How to get array size for each document

Related topics