Document Count discrepancy after Reindex

Today I tried a reindex on v5.4.3. The document count dropped from 6.4k in the source index to 1.6k in the destination index.

Is this expected behavior for 5.4.x? I've been using document count to validate reindexing on v5.3.2. Should I be using another data point for validation? :confused:

Thanks in advance!!

Rich

Source Index:

Destination Index:

Run Remote Reindex:

POST _reindex?wait_for_completion=true
{
  "source": {
    "remote": {
      "host": "http://Host01:9200"
    },
    "index": "images-v3"
  },
  "dest": {
    "index": "images-v3"
  }
}

Results Remote Reindex:

{
  "took": 1320,
  "timed_out": false,
  "total": 1615,
  "updated": 0,
  "created": 1615,
  "deleted": 0,
  "batches": 2,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
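For anyone using the reindex response itself as a validation data point, here is a minimal Python sketch that checks the body above for trouble. The helper name `check_reindex_response` is made up for illustration, and the check assumes that on a clean run the `created`, `updated`, `deleted`, `noops` and `version_conflicts` counters add up to `total`.

```python
def check_reindex_response(resp):
    """Return a list of problems found in a _reindex response body (empty if none)."""
    problems = []
    if resp.get("timed_out"):
        problems.append("request timed out")
    if resp.get("failures"):
        problems.append("%d failure(s) reported" % len(resp["failures"]))
    if resp.get("version_conflicts", 0):
        problems.append("%d version conflict(s)" % resp["version_conflicts"])
    # Assumption: every document seen by the scroll lands in one of these counters.
    counters = ("created", "updated", "deleted", "noops", "version_conflicts")
    handled = sum(resp.get(key, 0) for key in counters)
    if handled != resp.get("total", 0):
        problems.append("handled %d of %d documents" % (handled, resp.get("total", 0)))
    return problems


# Response body copied from the post above.
response = {
    "took": 1320,
    "timed_out": False,
    "total": 1615,
    "updated": 0,
    "created": 1615,
    "deleted": 0,
    "batches": 2,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {"bulk": 0, "search": 0},
    "failures": [],
}

problems = check_reindex_response(response)
print(problems or "reindex response looks clean")
```

By this check the reindex itself completed cleanly: it saw 1615 documents and created 1615, so the gap has to come from how the two indices count documents rather than from documents being dropped in transit.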

That's weird.

Can you run on both clusters:

 _cat/indices?v
 _cat/shards?v

Have you made any changes to mappings for the index, e.g. removing the use of nested documents?


@dadoonet Exactly!! Cool I'm not crazy. :slight_smile:
@Christian_Dahlqvist - I'm not managing the indices, but I know the team is using parent/child documents. Not sure how much they've messed with the mappings. The source is the original Dev Env. I'm trying to reindex into a QA Env.

_cat/indices/images-v3?v

Source:

health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   images-v3 4z51-msXSkasgkfrzTlAyw   5   1       6194            0      9.7mb          9.7mb

Destination:

health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   images-v3 rwcE2PC5T4WiWKsramDLNw   5   1       1615            0      9.5mb          9.5mb

_cat/shards/images-v3?v

Source Cluster:

index     shard prirep state      docs store ip            node
images-v3 3     p      STARTED    1219 1.8mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 3     r      UNASSIGNED
images-v3 2     p      STARTED    1249 1.7mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 2     r      UNASSIGNED
images-v3 1     p      STARTED    1292   2mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 1     r      UNASSIGNED
images-v3 4     p      STARTED    1212 1.9mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 4     r      UNASSIGNED
images-v3 0     p      STARTED    1222   2mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 0     r      UNASSIGNED

Destination:

index     shard prirep state      docs store ip            node
images-v3 1     p      STARTED     345   2mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 1     r      UNASSIGNED                          
images-v3 2     p      STARTED     312 1.7mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 2     r      UNASSIGNED                          
images-v3 3     p      STARTED     301 1.8mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 3     r      UNASSIGNED                          
images-v3 4     p      STARTED     327 1.8mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 4     r      UNASSIGNED                          
images-v3 0     p      STARTED     330 2.1mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 0     r      UNASSIGNED
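A quick sanity check on the numbers above, as a small Python sketch. It sums the `docs` column of the primary shards on each side. The helper name `primary_doc_total` is made up for illustration, and the reading of the gap is an assumption: the `_cat` doc counts come from Lucene and, as far as I know, include the hidden documents created for nested fields, so a source index with nested objects would report more documents than a destination that stores the same data as plain objects.

```python
def primary_doc_total(cat_shards_output):
    """Sum the docs column over STARTED primary shards in `_cat/shards?v` output."""
    total = 0
    for line in cat_shards_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        # Unassigned replica rows stop after the state column, so they are skipped.
        if len(parts) >= 5 and parts[2] == "p" and parts[3] == "STARTED":
            total += int(parts[4])
    return total


# Rows copied from the outputs above (replica rows trimmed to one per side).
source_shards = """
index     shard prirep state      docs store ip            node
images-v3 3     p      STARTED    1219 1.8mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 3     r      UNASSIGNED
images-v3 2     p      STARTED    1249 1.7mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 1     p      STARTED    1292   2mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 4     p      STARTED    1212 1.9mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
images-v3 0     p      STARTED    1222   2mb 172.17.38.210 dev_elastic_stack_contentclound_5_4_0-01
"""

dest_shards = """
index     shard prirep state      docs store ip            node
images-v3 1     p      STARTED     345   2mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 1     r      UNASSIGNED
images-v3 2     p      STARTED     312 1.7mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 3     p      STARTED     301 1.8mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 4     p      STARTED     327 1.8mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
images-v3 0     p      STARTED     330 2.1mb 172.17.16.250 qa_elastic_stack_contentclound_5_4_3-01
"""

source_total = primary_doc_total(source_shards)
dest_total = primary_doc_total(dest_shards)
print(source_total, dest_total, source_total - dest_total)  # 6194 1615 4579
```

The shard totals match `docs.count` from `_cat/indices` on both sides (6194 and 1615), and the destination total equals the `total` in the reindex response. Under the assumption above, the remaining 4579 would be nested `tags` entries on the source rather than missing documents. The `_count` API counts only top-level documents, so it is probably the safer number to compare across clusters.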

Can you get the mappings for the 2 indices as well using the get mapping API?

Source Index Mapping:

{
  "images-v3": {
    "mappings": {
      "image": {
        "properties": {
          "brand_id_faves": {
            "type": "keyword"
          },
          "brand_ids": {
            "type": "keyword"
          },
          "date": {
            "type": "date"
          },
          "diffbotUri": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "docId": {
            "type": "long"
          },
          "estimatedDate": {
            "type": "date"
          },
          "gburl": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "height": {
            "type": "long"
          },
          "humanLanguage": {
            "type": "keyword"
          },
          "lastCrawlTimeUTC": {
            "type": "long"
          },
          "naturalHeight": {
            "type": "long"
          },
          "naturalWidth": {
            "type": "long"
          },
          "pageUrl": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "parentUrlDocId": {
            "type": "long"
          },
          "parent_doc_ids": {
            "type": "keyword"
          },
          "primary": {
            "type": "boolean"
          },
          "siteName": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "tags": {
            "type": "nested",
            "properties": {
              "count": {
                "type": "long"
              },
              "label": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "rdfTypes": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "score": {
                "type": "float"
              },
              "uri": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "text": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "timestamp": {
            "type": "date"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "url": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "width": {
            "type": "long"
          },
          "xpath": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

Destination Index Mapping:

{
  "images-v3": {
    "mappings": {
      "image": {
        "properties": {
          "brand_ids": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "date": {
            "type": "date"
          },
          "diffbotUri": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "docId": {
            "type": "long"
          },
          "estimatedDate": {
            "type": "date"
          },
          "gburl": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "height": {
            "type": "long"
          },
          "humanLanguage": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "lastCrawlTimeUTC": {
            "type": "long"
          },
          "naturalHeight": {
            "type": "long"
          },
          "naturalWidth": {
            "type": "long"
          },
          "pageUrl": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "parentUrlDocId": {
            "type": "long"
          },
          "parent_doc_ids": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "primary": {
            "type": "boolean"
          },
          "siteName": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "tags": {
            "properties": {
              "count": {
                "type": "long"
              },
              "label": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "rdfTypes": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "score": {
                "type": "float"
              },
              "uri": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "text": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "timestamp": {
            "type": "date"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "url": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "width": {
            "type": "long"
          },
          "xpath": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
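Spotting the differences between two mappings this long by eye is error-prone, so here is a rough Python sketch that flattens each `properties` block into `path -> type` pairs and reports where they disagree. The function names are made up for illustration, the input is only an excerpt of the two mappings above, and multi-fields under `fields` are deliberately ignored. A field with sub-properties and no explicit `type` is treated as a plain `object`, which is how Elasticsearch maps it.

```python
def field_types(properties, prefix=""):
    """Flatten a mapping 'properties' dict into {dotted_path: type}."""
    flat = {}
    for name, spec in properties.items():
        path = prefix + name
        # No explicit type plus sub-properties means a plain object field.
        flat[path] = spec.get("type", "object")
        if "properties" in spec:
            flat.update(field_types(spec["properties"], path + "."))
    return flat


def diff_mappings(source_props, dest_props):
    """Return {path: (source_type, dest_type)} for every field that differs."""
    src = field_types(source_props)
    dst = field_types(dest_props)
    diffs = {}
    for path in sorted(set(src) | set(dst)):
        if src.get(path) != dst.get(path):
            diffs[path] = (src.get(path), dst.get(path))
    return diffs


# Excerpt of the source mapping above (only a few representative fields).
source_props = {
    "brand_id_faves": {"type": "keyword"},
    "brand_ids": {"type": "keyword"},
    "humanLanguage": {"type": "keyword"},
    "parent_doc_ids": {"type": "keyword"},
    "tags": {"type": "nested", "properties": {"count": {"type": "long"}}},
    "width": {"type": "long"},
}

# Excerpt of the destination mapping above.
dest_props = {
    "brand_ids": {"type": "text"},
    "humanLanguage": {"type": "text"},
    "parent_doc_ids": {"type": "text"},
    "tags": {"properties": {"count": {"type": "long"}}},
    "width": {"type": "long"},
}

differences = diff_mappings(source_props, dest_props)
for path, (src_type, dst_type) in differences.items():
    print(path, ":", src_type, "->", dst_type)
```

On the excerpt this flags `brand_id_faves` (absent on the destination), the three `keyword` fields that became `text`, and `tags` going from `nested` to `object`. The last one is the difference that would change the document count.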

@Christian_Dahlqvist @dadoonet Thanks for your help!!

Mapping issue - :disappointed::disappointed::disappointed::disappointed:

I didn't even think about mappings!! I'm usually working with index templates.

Created Index with Mappings

PUT images-v3-maptest
{
  "mappings": {
    "image": {
      "properties": {
        .....
      }
    }
  }
}
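In case it helps anyone doing the same thing, the request body for the create-index call can be derived from the get mapping response rather than retyped. A minimal Python sketch, assuming the 5.x response shape shown earlier in the thread (`index -> mappings -> type`); the helper name `create_index_body` is made up, and index settings such as shard count or analyzers are not copied by this.

```python
import json


def create_index_body(get_mapping_response, source_index):
    """Build a create-index body that reuses the source index's mappings."""
    # The get mapping response is keyed by index name; create-index wants
    # the "mappings" object at the top level of its body.
    return {"mappings": get_mapping_response[source_index]["mappings"]}


# Excerpt of the source get mapping response from earlier in the thread.
get_mapping_response = {
    "images-v3": {
        "mappings": {
            "image": {
                "properties": {
                    "brand_ids": {"type": "keyword"},
                    "tags": {
                        "type": "nested",
                        "properties": {"count": {"type": "long"}},
                    },
                }
            }
        }
    }
}

body = create_index_body(get_mapping_response, "images-v3")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the `PUT` request for the new index before running the reindex into it.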

Reindex
POST _reindex
{
  "source": {
    "index": "images-v3"
  },
  "dest": {
    "index": "images-v3-maptest"
  }
}

Results
