Embedded Elasticsearch - ClassCastException when clustered

We have a Java application which embeds Elasticsearch instances on multiple
servers and starts them as follows:

node = nodeBuilder().client(false).local(false).node();
client = node.client();

// wait until the cluster is ready
client.admin().cluster().health(new ClusterHealthRequest().waitForYellowStatus()).actionGet();

This works fine when started on a single box: the application is able to
put/get records without issue and passes benchmarks/stress tests.

When we start the application on multiple nodes, with elasticsearch.yml
configured to allow the nodes to find each other via unicast, the cluster
comes up fine and goes green immediately. However, get requests randomly
start failing fairly quickly with a ClassCastException. Here is the
relevant part of the exception:

2013-05-13 16:37:35,597 ERROR New I/O worker #15 Hazelbridge - Exception in ServerHandler java.lang.ClassCastException: org.elasticsearch.common.bytes.ChannelBufferBytesReference cannot be cast to org.elasticsearch.common.bytes.BytesArray

This exception occurs at the following call:

GetResponse g = client.prepareGet(esIndex, namespace, new String(key))
        .setFields("expiry", "data")
        .execute().actionGet();

if (g.isExists()) {
    int exp = (int) g.getField("expiry").getValue();
    BytesArray b = (BytesArray) g.getField("data").getValue(); // THIS LINE CAUSES THE EXCEPTION
}

The value of exp is retrieved successfully and it can be logged to verify.

This doesn't happen for the first 3 or 4 puts, but then it starts happening
on 75% of gets. It does not happen in the single-node case.

Help?

The mapping for the type is set up with the following code:


String mapping = jsonBuilder()
    .startObject()
        .startObject(ns)
            .startObject("_ttl")
                .field("enabled", "true")
            .endObject()
            .startObject("_source")
                .field("enabled", "false")
            .endObject()
            .startObject("_all")
                .field("enabled", "false")
            .endObject()
            .startObject("properties")
                .startObject("data")
                    .field("type", "binary")
                    .field("index", "not_analyzed")
                    .field("store", "yes")
                    .field("compress", "false")
                .endObject()
                .startObject("expiry")
                    .field("type", "integer")
                    .field("index", "not_analyzed")
                    .field("store", "yes")
                    .field("compress", "false")
                .endObject()
            .endObject()
        .endObject()
    .endObject()
    .string();

client.admin().indices().preparePutMapping(esIndex)
    .setType(ns).setSource(mapping).execute().actionGet();


No one?


Hey,

Sorry for not getting back. I will try to reproduce your issue tomorrow. Have
you only indexed documents which have both fields set?
Also, can you provide the full exception from your log file, and tell us your
Elasticsearch version as well?

--Alex


Hey,

At first I thought you had found a bug in Elasticsearch, but it is actually a
bug in your code. You are casting to the wrong class: you need to cast to
BytesReference instead of BytesArray.

Short version: BytesReference is the abstracting interface. BytesArray is used
if the GetRequest got the data from a shard on the node the query was issued
at (because there is no need to serialize the data and send it over the wire).
ChannelBufferBytesReference is used if the data was fetched from a shard on
another node in your cluster.

This also explains why you are not seeing this exception all the time, but
only about 75% of the time (I guess you have a four-node setup then).
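
A corrected read of your snippet would look roughly like this (just a sketch;
the toBytes() call is only needed if you want a plain byte[] copy):

// assumes: import org.elasticsearch.action.get.GetResponse;
//          import org.elasticsearch.common.bytes.BytesReference;
GetResponse g = client.prepareGet(esIndex, namespace, new String(key))
        .setFields("expiry", "data")
        .execute().actionGet();

if (g.isExists()) {
    int exp = (int) g.getField("expiry").getValue();
    // cast to the interface, which covers both BytesArray (local shard)
    // and ChannelBufferBytesReference (remote shard)
    BytesReference data = (BytesReference) g.getField("data").getValue();
    byte[] raw = data.toBytes();
}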

Hope this helps and have a nice weekend!

--Alex


Thank you for your response - I'll test and get back.


The fix worked great - thank you!


New issue with this, in the same environment. After several days of working
fine clustered, using the (BytesReference) cast you suggested, it starts
failing again - but the failures now alternate (see below).

Inexplicably, after some period of time - usually days - reads of the binary
values we're storing start failing with:

Aug 15 15:56:04 localhost.localdomain ERROR: Hazelbridge - Exception in ServerHandler java.lang.ClassCastException: java.lang.String cannot be cast to org.elasticsearch.common.bytes.BytesReference
Aug 15 15:56:04 localhost.localdomain ERROR: Hazelbridge - ..com.symplicity.hazelbridge.backends.elasticsearch.ElasticsearchInstance.get(ElasticsearchInstance.java:163)

That occurs at this line:

BytesReference b = (BytesReference) g.getField("data").getValue();

The value at this point has somehow been changed into a String, even though it
was originally put in as binary and verified via the mapping API.

This is further verified by pulling the data via the HTTP API. Usually, when
pulling binary records, the binary data is not returned and you simply see
"exists": true in the returned JSON. After the "corruption" has occurred, it
returns the base64-encoded binary string. Internally, the stored data has
somehow changed type.

Interestingly, when this happens, the "expiry" field in the same record has
also spontaneously changed from an int to a long, even though in our code
we never treat that value as a long. These two conditions always occur on
the same record - i.e. once the bug shows up, both fields will have
switched to the wrong data type. This is an example of that occurring:

Aug 15 15:56:04 localhost.localdomain ERROR: Hazelbridge - Exception in ServerHandler java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
Aug 15 15:56:04 localhost.localdomain ERROR: Hazelbridge - ..com.symplicity.hazelbridge.backends.elasticsearch.ElasticsearchInstance.get(ElasticsearchInstance.java:152)

Also, it sometimes cycles between the wrong type and the right one on
subsequent get requests - even to the same cluster node.
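
For reference, a read that tolerates both representations of the two fields
would look roughly like this (just a sketch, not a fix; it reuses the
GetResponse g from the earlier snippet and assumes the String form really is
the Base64 encoding of the stored bytes):

Object expiryValue = g.getField("expiry").getValue();
int exp = ((Number) expiryValue).intValue();   // handles both Integer and Long

Object dataValue = g.getField("data").getValue();
byte[] raw;
if (dataValue instanceof BytesReference) {
    raw = ((BytesReference) dataValue).toBytes();
} else {
    // assumption: the String form is the Base64 encoding of the original bytes
    raw = javax.xml.bind.DatatypeConverter.parseBase64Binary((String) dataValue);
}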

If I had to guess, something is perhaps amiss in shard migration - the
record's associated mapping is somehow lost on one of the nodes. I'm guessing
this because it only happens after the cluster has been running for a while.

Any help would be appreciated.

Thanks


bump


I am seeing exactly the same behavior as you described. Wondering if you ever figured out the problem.

Caused by: ! java.lang.ClassCastException: java.lang.String cannot be cast to org.elasticsearch.common.bytes.BytesReference

The offending line of code is:

BytesArray bytesArray = ((BytesReference) resp
        .getField(IndexSchema.BLOOM_FILTER)
        .getValue())
        .toBytesArray();

The BLOOM_FILTER field is defined as binary type. Not sure why the query returned a String.

I am running Elasticsearch version 1.3.4.

Thanks,

Kirk