Reindex conflicts


#1

Hi guys,

When using /_reindex, wasn't "conflicts": "proceed" supposed to prevent the process from aborting?

curl -XPOST http://187.41.XXX.XX:9200/_reindex?pretty -d'{
  "conflicts": "proceed",
  "source": {
    "index": "dull-2016.12.01"
  },
  "dest": {
    "index": "the_dull-2016.12.01"
  }
}'

Output

{
  "took" : 768304,
  "timed_out" : false,
  "total" : 13745504,
  "updated" : 0,
  "created" : 9072999,
  "deleted" : 0,
  "batches" : 9073,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [
    {
      "index" : "the_dull-2016.12.01",
      "type" : "logs",
      "id" : "AVi6lWRXLAkCrdn_k-wj",
      "cause" : {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse [parsed_tracking.data.data.uid]",
        "caused_by" : {
          "type" : "number_format_exception",
          "reason" : "For input string: \"ABC\""
        }
      },
      "status" : 400
    }
  ]
}

It looks like 4672505 documents are missing (total minus created).
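
For reference, the arithmetic behind that number can be sketched as below. The counters are copied from the response; the batch size of 1000 is an assumption (it is the reindex default, and the request did not set one). If that assumption holds, the counters suggest the request stopped right after the batch that contained the failing document.

```shell
# Counters copied from the _reindex response above.
total=13745504
created=9072999
batches=9073
failures=1

# Assumption: the default reindex batch size of 1000 was used.
batch_size=1000

# Documents that never reached the destination index.
unprocessed=$(( total - created ))

# Documents pulled from the source before the request stopped.
fetched=$(( batches * batch_size ))

echo "unprocessed: $unprocessed"
echo "fetched: $fetched, created + failures: $(( created + failures ))"
```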

Any help?

TIA


(Nik Everett) #2

"conflicts": "proceed" only prevents aborting on version conflicts. This isn't a version conflict. It looks like the parsed_tracking.data.data.uid field is a number in the mapping but a string in some documents. This usually happens when the documents contain both numbers (that is, values without quotes around them) and strings, and the numbers arrive first, so Elasticsearch infers the type of the field to be a number.

For reindex it is usually best to create the destination index, mapping and all, before starting the process.
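
A minimal sketch of what that could look like, assuming Elasticsearch 5.x (where mapping types such as logs and the keyword field type exist). The host http://localhost:9200 is a placeholder, and the names ES_URL, DEST_INDEX, MAPPING and create_dest_index are made up for this example. Mapping the field as keyword is only one option; it accepts both 123 and "ABC", but it may not be the type you want for queries.

```shell
# Assumption: a local test node; replace with your own host.
ES_URL="http://localhost:9200"
DEST_INDEX="the_dull-2016.12.01"

# Explicit mapping for the one field that failed to parse.
# Fields not listed here are still mapped dynamically.
MAPPING='{
  "mappings": {
    "logs": {
      "properties": {
        "parsed_tracking": {
          "properties": {
            "data": {
              "properties": {
                "data": {
                  "properties": {
                    "uid": { "type": "keyword" }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}'

# Create the destination index, mapping included, before running _reindex.
create_dest_index() {
  curl -XPUT "$ES_URL/$DEST_INDEX?pretty" \
    -H 'Content-Type: application/json' \
    -d "$MAPPING"
}
```

Call create_dest_index once against an index that does not exist yet, then run the _reindex request. An existing field mapping cannot be changed in place, so an index that already has uid mapped as a number would have to be deleted and recreated first.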


#3

Before running the reindex, I created a mapping for the new destination indices.

I thought that "proceed" would also skip the documents whose values don't match the field type.

Thank you

