Docs count on migration - difference between `_count` and `_stats`

Hi.
We're migrating from an on-premise Elasticsearch 7.17.10 cluster to an Elastic Cloud 7.17.14 one, using esm.

Everything seems fine, but we noticed that some high-volume indexes have a _count very different from (a) the number of documents handled by esm and (b) the total number of docs returned by _stats under total > docs > count. In the source environment, _stats reports something like 380000 docs, while _count and esm report ~41521. After the migration, both _count and _stats on the destination return ~41521.

Searching around, I learnt that _stats also counts nested documents. I have some indexes with nested docs and, apparently, they've been copied correctly.
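To make the nested-document effect concrete, here is a minimal Python sketch of the arithmetic. It assumes that each object under a field mapped as `nested` is stored as one extra hidden Lucene document, which `_stats` includes in docs.count and `_count` does not. The function names, the sample documents and the `comments` field are hypothetical, and the sketch only looks at top-level nested fields with a single level of nesting.

```python
from typing import Any, Dict, List, Set


def count_nested_objects(doc: Dict[str, Any], nested_fields: Set[str]) -> int:
    """Count the objects stored under top-level fields mapped as `nested`."""
    total = 0
    for field in nested_fields:
        value = doc.get(field)
        if isinstance(value, list):
            # An array of nested objects: one hidden document per element.
            total += len(value)
        elif isinstance(value, dict):
            # A single nested object: one hidden document.
            total += 1
    return total


def expected_counts(docs: List[Dict[str, Any]], nested_fields: Set[str]) -> Dict[str, int]:
    """Estimate what _count and _stats docs.count would report for `docs`."""
    top_level = len(docs)
    hidden = sum(count_nested_objects(d, nested_fields) for d in docs)
    return {"_count": top_level, "_stats.docs.count": top_level + hidden}


# Hypothetical sample: three top-level documents, `comments` mapped as nested.
sample_docs = [
    {"id": 1, "comments": [{"text": "a"}, {"text": "b"}]},
    {"id": 2, "comments": {"text": "c"}},
    {"id": 3},
]
result = expected_counts(sample_docs, {"comments"})
print(result)
```

Under these assumptions, a large gap between the two numbers would simply mean that the documents carry many nested objects each.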

So I'm puzzled about where this huge difference in doc counts comes from, and whether I can be confident that all my data is being copied correctly. :grin:

Thanks

Check if the mapping is the same.

It's almost the same. It should be identical, but there are small differences: some fields that were mapped as float are text on the new installation, and a handful of fields are missing in the new one.
I know very little about ES, but it seems strange that these mapping differences could lead to a gap of this magnitude between the two, especially since there are many more documents on the old cluster than on the new one.

Side request: how can I obtain a list of _count values for all indexes? From what I understand, this is an ES-only endpoint, so it cannot be queried from outside. I can do it on the on-prem install, but I haven't figured out how to do the same with the managed Elastic Cloud instance.

Why? You did not apply the same mapping?

Per index?

GET /*/_search
{
  "track_total_hits": true, 
  "size": 0,
  "aggs": {
    "index": {
      "terms": {
        "field": "_index"
      }
    }
  }
}

Or globally?

GET /*/_count
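
For comparing the two clusters index by index, a minimal Python sketch along these lines may help. It assumes the response of the per-index `_search` request above has already been fetched from each cluster (for example from Kibana Dev Tools) and loaded as a dict. The function names, index names and counts are hypothetical. Note that a terms aggregation returns only the top 10 buckets by default, so with more than 10 indexes a larger `size` would be needed inside the `terms` block.

```python
from typing import Any, Dict, Optional, Tuple


def counts_from_terms_agg(response: Dict[str, Any], agg_name: str = "index") -> Dict[str, int]:
    """Turn the terms-aggregation buckets into an {index_name: doc_count} map."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


def diff_counts(
    source: Dict[str, int], destination: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return the indexes whose counts differ, as (source, destination) pairs."""
    diffs = {}
    for name in sorted(set(source) | set(destination)):
        src, dst = source.get(name), destination.get(name)
        if src != dst:
            # None means the index is missing on that side.
            diffs[name] = (src, dst)
    return diffs


# Hypothetical responses, trimmed to the part the sketch reads.
source_response = {"aggregations": {"index": {"buckets": [
    {"key": "orders", "doc_count": 41521},
    {"key": "users", "doc_count": 120},
]}}}
destination_response = {"aggregations": {"index": {"buckets": [
    {"key": "orders", "doc_count": 41521},
    {"key": "users", "doc_count": 118},
]}}}

mismatches = diff_counts(
    counts_from_terms_agg(source_response),
    counts_from_terms_agg(destination_response),
)
print(mismatches)
```

An empty result would suggest that the top-level document counts match for every index seen in both responses.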

The indexes were created with the wrong mapping; I've fixed it.

As I said in the first post, I learnt that _search and _count can return different results. I was looking for a way to list all indexes with their _count. I haven't found the right endpoint for that last call on our Elastic Cloud instance.

Not sure about this. But if you run a _refresh first, they should return the same values.
