How to compute the average number of indexed documents/sec from an ES query?

Hello,

I would like to build a query that returns the @timestamp of the first and last indexed documents, and then use those two values to compute the average number of documents indexed per second over that interval.

I know I can use _stats to compute this metric, but _stats does not report the same number of indexed documents as _cat/count, the former being the correct value.

How can I compute this value correctly?

Thanks.

[Edit] After some testing, this does not seem to be reliable, so use with caution.

I am answering my own question, as it turned out to be very simple. Here is a bash script that computes the average number of documents indexed per second:

es_url="$1"
index="$2"
t1=$(curl -sX POST "${es_url}/${index}/_search" -H 'Content-Type: application/json' -d'
{
    "from" : 0, "size" : 1,
    "query": {
        "match_all": {}
    },
   "sort" : [
      {"@timestamp" : {"order" : "asc", "mode" : "min"}}
   ]
}
' | jq '.hits.hits[0].sort[0]')

t2=$(curl -sX POST "${es_url}/${index}/_search" -H 'Content-Type: application/json' -d'
{
    "from" : 0, "size" : 1,
    "query": {
        "match_all": {}
    },
   "sort" : [
      {"@timestamp" : {"order" : "desc", "mode" : "max"}}
   ]
}
' | jq '.hits.hits[0].sort[0]')

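# Total number of documents in the index (third column of _cat/count).
count=$(curl -sX GET "${es_url}/_cat/count/${index}" | cut -d ' ' -f 3)

# Elapsed time between the first and last documents, in milliseconds.
time=$(bc <<< "$t2 - $t1")

# Multiply before dividing: bc defaults to scale=0, so "count / time" alone
# would be truncated to an integer before the conversion to seconds.
avg=$(echo "scale=2; $count * 1000 / $time" | bc)

echo "$avg"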
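
When doing this arithmetic in bc, note that the default scale is 0, so a division such as count / time is truncated to an integer before any further multiplication. Below is a minimal sketch of the same calculation in isolation, using awk for floating-point division. The function name docs_per_sec and its argument order are hypothetical, and both timestamps are assumed to be epoch milliseconds, as returned in the sort values above.

```bash
# Hypothetical helper: average documents per second between two timestamps.
# Arguments: document count, first timestamp (ms), last timestamp (ms).
docs_per_sec() {
    local count="$1" first_ms="$2" last_ms="$3"
    awk -v c="$count" -v a="$first_ms" -v b="$last_ms" 'BEGIN {
        elapsed_ms = b - a
        # A zero or negative interval has no meaningful rate.
        if (elapsed_ms <= 0) exit 1
        # Multiply first, then divide, and print two decimal places.
        printf "%.2f\n", c * 1000 / elapsed_ms
    }'
}
```

For example, docs_per_sec "$count" "$t1" "$t2" could stand in for the two bc steps; it returns a non-zero status when the interval is empty.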

If anyone has a simpler or more efficient solution...
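
One possibly simpler variant is to fetch everything in a single _search request, using min and max aggregations on @timestamp together with track_total_hits. This is only a sketch under assumptions: it expects Elasticsearch 7 or later (where hits.total is an object with a value field), an index holding documents with at least two distinct timestamps, and jq on the PATH. The names rate_query_body and es_index_rate are hypothetical.

```bash
# Request body: no hits returned, exact total count, oldest and newest @timestamp.
rate_query_body='{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "first": { "min": { "field": "@timestamp" } },
    "last":  { "max": { "field": "@timestamp" } }
  }
}'

# Hypothetical helper: prints the average documents per second for an index.
# Arguments: Elasticsearch URL, index name.
es_index_rate() {
    local es_url="$1" index="$2"
    curl -s -X POST "${es_url}/${index}/_search" \
         -H 'Content-Type: application/json' -d "$rate_query_body" |
    jq '(.aggregations.last.value - .aggregations.first.value) as $ms
        | if $ms > 0 then .hits.total.value * 1000 / $ms else empty end'
}
```

Usage would look like es_index_rate "$es_url" "$index". Because the count and both timestamps come from the same response, they describe the same point in time, which may avoid some of the mismatch between separate requests; whether that addresses the unreliability noted above has not been verified.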

