Bytes can be at most 32766 in length

Hello World!

I'm trying to _reindex one of my indices, but I'm running into the following issue:

{
  "took": 9502,
  "timed_out": false,
  "total": 114143,
  "updated": 0,
  "created": 2999,
  "deleted": 0,
  "batches": 3,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [
    {
      "index": "dest",
      "type": "doc",
      "id": "cmBeZW4BYpT_rNQS9_De",
      "cause": {
        "type": "illegal_argument_exception",
        "reason": "Document contains at least one immense term in field=\"post.keyword\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[-48, -99, -48, -80, -47, -125, -47, -121, -48, -67, -48, -66, 45, -48, -72, -47, -127, -47, -127, -48, -69, -48, -75, -48, -76, -48, -66, -48, -78, -48]...', original message: bytes can be at most 32766 in length; got 67247",
        "caused_by": {
          "type": "max_bytes_length_exceeded_exception",
          "reason": "max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 67247"
        }
      },
      "status": 400
    }
  ]
}
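
A side note on reading the error above: the "prefix of the first immense term" appears to be printed as signed Java byte values. Here is a minimal Python sketch that turns those bytes back into text. The helper name `decode_prefix` is made up for illustration, and the byte list is copied from the message.

```python
# Signed byte values copied from the error message above.
prefix = [-48, -99, -48, -80, -47, -125, -47, -121, -48, -67, -48, -66, 45,
          -48, -72, -47, -127, -47, -127, -48, -69, -48, -75, -48, -76,
          -48, -66, -48, -78, -48]


def decode_prefix(signed_bytes):
    """Map signed bytes (-128..127) to 0..255 and decode them as UTF-8.

    errors="ignore" drops the incomplete multi-byte sequence where the
    prefix was cut off.
    """
    raw = bytes(b & 0xFF for b in signed_bytes)
    return raw.decode("utf-8", errors="ignore")


text = decode_prefix(prefix)
print(text)
print(len(text), "characters from", len(prefix), "bytes")
```

The decoded prefix is ordinary Cyrillic text, where each letter takes two bytes in UTF-8, so the byte length of the term is roughly double its character count.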

Relevant part of dest/_mapping:

  "post" : {
    "type" : "text",
    "fields" : {
      "keyword" : {
        "type" : "keyword",
        "ignore_above" : 70656
      }
    }
  },
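
A note on why `ignore_above: 70656` does not prevent this error: per the Elasticsearch documentation, `ignore_above` is counted in characters, while the 32766 limit in the message is counted in UTF-8 bytes. The sketch below is only a simplified model of that interaction, not Elasticsearch's actual code. The function name, the three labels, and the sample string are made up, and it assumes Python's `len()` matches the character count for this kind of text.

```python
LUCENE_MAX_TERM_BYTES = 32766  # limit quoted in the error message


def keyword_outcome(value: str, ignore_above: int) -> str:
    """Return 'ignored', 'rejected', or 'indexed' (hypothetical labels)."""
    if len(value) > ignore_above:
        # ignore_above is compared against the number of characters
        return "ignored"
    if len(value.encode("utf-8")) > LUCENE_MAX_TERM_BYTES:
        # the term limit is compared against the number of UTF-8 bytes
        return "rejected"
    return "indexed"


# Made-up sample: 33000 Cyrillic characters, 66000 bytes in UTF-8.
sample = "д" * 33000
print(keyword_outcome(sample, 70656))
print(keyword_outcome(sample, 32766 // 4))
```

With the mapping above, a value like the sample passes the character check but fails the byte check. The documentation suggests 32766 / 4 = 8191 as a safe `ignore_above` for text with many non-ASCII characters, since a UTF-8 character takes at most 4 bytes.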

Please advise.

I guess you did not define the mapping for the destination index, so you are using the default mapping.

Thank you for the quick reply!

I hit submit by accident before I had finished writing my question. I have now updated my initial question with the relevant part of the custom mapping that I use for my index, which is called "dest".

What does this field contain? Why do you need a keyword datatype for this field?

HTML.

I'm now thinking that the keyword type is probably not the best way to store HTML...

Definitely not. What is the use case for this field?

The app needs to store the full HTML, in case it was parsed incorrectly and has to be reparsed.

I'd use a text field with index: false. See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-index.html
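
For illustration, here is a small Python sketch that builds such a field definition and prints it as JSON. The field name `post` comes from the mapping above. The helper name is made up, and only the `properties` fragment is shown because the surrounding mapping body depends on the Elasticsearch version in use.

```python
import json


def unindexed_text_field() -> dict:
    """Field definition for text that is stored in _source but not indexed."""
    return {"type": "text", "index": False}


# Fragment to place under "properties" in the destination index mapping.
properties = {"post": unindexed_text_field()}
print(json.dumps(properties, indent=2))
```

With `index: false` the field cannot be searched, but the HTML is still returned as part of `_source`. No terms are produced for it, so the 32766-byte term limit should no longer apply. An existing field's mapping cannot be changed this way in place, so the destination index would presumably need to be recreated with the new mapping before reindexing.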

That's exactly what I need :)

Thank you!