Different Doc count after reindexing

cr_168328 · January 10, 2023, 12:02pm

I have an index user1. To add some fields, I created a new index and reindexed into it with this request:

POST /_reindex?scroll=30m
{
  "source": {
    "index": "user1"
  },
  "dest": {
    "index": "user2"
  }
}

After reindexing is complete, GET _cat/indices gives the following result for the 2 indices:

health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   user1 _tg9nbGWSreKBBsa7E0XwQ   5   1     298934        47108     36.7gb         36.7gb
yellow open   user2 ne1m5aWDR5K064qCo0awKw   5   1     314441            0     25.5gb         25.5gb

But a _count query on both indices gives the same value.

Why is the docs count higher and the storage size smaller for the new index in GET _cat/indices?

dadoonet · January 10, 2023, 4:19pm

No idea. Maybe you had existing data before? Or you changed the mapping and are now using nested documents?

As for the smaller size: you probably have fewer segments, no updates, better compression...

Christian_Dahlqvist · January 10, 2023, 4:23pm

Assuming the mappings are the same, it seems you have updated and/or deleted documents in the user1 index. These take up space because old or deleted documents are not removed immediately. I suspect this explains the difference in size and count.
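
For readers following along, the docs.deleted column of the _cat/indices output quoted earlier is where this shows up: user1 reports 47108 deleted documents and user2 reports 0. A minimal sketch that parses those two lines, assuming the default _cat/indices column order (the helper name parse_cat_indices is made up for illustration):

```python
# Default _cat/indices columns, in the order they appear in the pasted output.
COLUMNS = ["health", "status", "index", "uuid", "pri", "rep",
           "docs.count", "docs.deleted", "store.size", "pri.store.size"]

# The two lines pasted in the original question.
CAT_OUTPUT = """\
yellow open user1       _tg9nbGWSreKBBsa7E0XwQ      5 1 298934 47108   36.7gb   36.7gb
yellow open user2       ne1m5aWDR5K064qCo0awKw      5 1 314441     0   25.5gb   25.5gb
"""


def parse_cat_indices(text):
    """Hypothetical helper: map index name -> dict of column values."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank or unexpected lines
        row = dict(zip(COLUMNS, fields))
        rows[row["index"]] = row
    return rows


rows = parse_cat_indices(CAT_OUTPUT)
for name, row in rows.items():
    live = int(row["docs.count"])
    deleted = int(row["docs.deleted"])
    share = deleted / (live + deleted)
    print(f"{name}: live={live} deleted={deleted} deleted_share={share:.1%}")
```

A non-zero docs.deleted can also come from indexing a document again under an existing ID, since that is handled as an update, so it does not necessarily mean an explicit delete was issued.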

cr_168328 · January 11, 2023, 4:27am

There was no existing data. The index was newly created for reindexing.

Yes, I changed the mapping and created a few more fields, including nested fields.

Could you please explain how adding nested fields increases the docs count (while the _count query shows the same count for both indices)?

cr_168328 · January 11, 2023, 4:31am

I haven't updated/deleted any documents.

The mappings are not the same. I updated the mapping and added a few more fields, including nested fields, in the new index. How does that cause the difference in count?

dadoonet · January 11, 2023, 8:10am

Each nested document is indexed as a document in Lucene. That's why you have more documents than in the source index.
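
To illustrate the point above: docs.count in _cat/indices is a Lucene-level number that includes the hidden nested documents, while _count only counts top-level documents. The sketch below is plain Python, not Elasticsearch code; the field names (addresses, profile) and the function name are hypothetical and only model the counting rule.

```python
def lucene_doc_count(doc, nested_paths, prefix=""):
    """Count the Lucene documents produced by one Elasticsearch document.

    Returns 1 for the object itself plus 1 for every object found under a
    field whose dotted path is in nested_paths (i.e. mapped as "nested").
    """
    total = 1
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if not isinstance(item, dict):
                continue  # scalar values never create extra documents
            inner = lucene_doc_count(item, nested_paths, path + ".")
            if path in nested_paths:
                total += inner      # nested object: its own Lucene document
            else:
                total += inner - 1  # plain object: flattened into the parent
    return total


# Hypothetical user document with two address objects.
user = {
    "name": "example",
    "addresses": [{"city": "x"}, {"city": "y"}],
    "profile": {"age": 3},
}

print(lucene_doc_count(user, nested_paths=set()))          # addresses as plain objects
print(lucene_doc_count(user, nested_paths={"addresses"}))  # addresses mapped as nested
```

Under this rule, and assuming user1 has no nested fields, the gap in the output above (314441 - 298934 = 15507) would be the number of nested objects in user2, which is consistent with _count returning the same value for both indices.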

