Cluster can't recover after upgrade from 1.1.1 to 1.3.2 due to MaxBytesLengthExceededException

After doing a rolling upgrade from 1.1.1 to 1.3.2, some shards are failing
to recover.
I have two nodes with 8 shards and 1 replica. The index is a daily rolling
index; after the upgrade, the old indices recovered fine. The error is only
happening in today's index. I didn't stop indexing during the upgrade. From
the stack trace below it seems that I have hit the maximum term length for
an unanalyzed field, but this field's values have always been longer than
32766 bytes. I searched the open Lucene 4.9 bugs but didn't find anything.
My main concern now is how to recover the cluster without losing the shards
that are failing to start. Also, will this limit always be enforced, and why
did it only start showing up now?

Here is the full stack trace of the exception:
[2014-09-10 18:39:03,045][WARN ][indices.cluster] [qldbtrindex1.qa.cyveillance.com] [transient_2014_09_10][7] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [transient_2014_09_10][7] failed to recover shard
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:269)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="providerEntity" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[123, 34, 119, 105, 107, 105, 112, 101, 100, 105, 97, 34, 58, 123, 34, 101, 120, 116, 101, 114, 110, 97, 108, 108, 105, 110, 107, 115, 34, 58]...', original message: bytes can be at most 32766 in length; got 249537
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:671)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:342)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:301)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:222)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1222)
    at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:563)
    at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:492)
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:769)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:250)
    ... 4 more
Caused by: org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 249537
    at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:284)
    at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:151)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:645)
    ... 14 more


You are running into this
problem: http://elasticsearch-users.115913.n3.nabble.com/encoding-is-longer-than-the-max-length-32766-td4056738.html

You need to change the mapping and define a maximum token length in your
analyzer. Unfortunately, you would need to have done that before migrating,
and I don't think you'll be able to fix the shards without this mapping
change in place.
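
For an unanalyzed string field, the simplest way to cap the term length is ignore_above in the mapping, which skips values longer than the given number of characters instead of trying to index them. Something along these lines (untested sketch; the type name is a placeholder, and 32766 / 4 = 8191 characters keeps even worst-case 4-byte UTF-8 under the byte limit):

{
  "mappings": {
    "mytype": {
      "properties": {
        "providerEntity": {
          "type": "string",
          "index": "not_analyzed",
          "ignore_above": 8191
        }
      }
    }
  }
}

For an analyzed field, the analyzer-level equivalent would be a truncate or length token filter.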

Jilles


Thank you Jilles, this really helped. I have actually changed the mapping
of that field to NO_INDEX to avoid this problem altogether. But it is
nice to know there is a valid solution out there.
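
In mapping terms, that change corresponds to roughly the following, so the field stays in _source but is no longer indexed at all (assuming the default _source is enabled; the type name is a placeholder):

{
  "mappings": {
    "mytype": {
      "properties": {
        "providerEntity": {
          "type": "string",
          "index": "no"
        }
      }
    }
  }
}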


Has anyone got a query that allows me to count/aggregate how many immense fields exist in an index?

Another caveat in my case is that my immense fields occur in nested documents, so many of the ways to interrogate field/value size don't work against nested fields. My scenario is also an upgrade issue, where reindexing would be much, much slower than just upgrading in place.

So far the closest I've got is to regex the value of the nested field to find docs with x characters or more, to identify the largest values. Pushing this query to larger value checks eventually results in a stack overflow because of how expensive the regex is.

{
  "query": {
    "bool": {
      "must": [
        { "match_all": {} },
        {
          "nested": {
            "path": "tags",
            "filter": {
              "regexp": {
                "value": {
                  "value": ".{2000,}"
                }
              }
            }
          }
        }
      ]
    }
  }
}
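
Running the same body with size set to 0 and reading hits.total should at least give a count of affected documents without fetching them, e.g. (index name is a placeholder, and query.json holds the body above):

curl -XGET 'localhost:9200/myindex/_search?size=0' -d @query.json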