Weird highlighting error ... failed to highlight field ... String index out of range

I cannot make head or tail of this error, and it's happening so randomly
that I don't even know where to start looking.

This is what the full error looks like:

Tire::Search::SearchRequestFailed: 500 :
{"error":"SearchPhaseExecutionException[Failed to execute phase
[query_fetch], total failure; shardFailures
{[7McitJnjQkqLkViqUpZUyw][content][4]:
FetchPhaseExecutionException[[content][4]: query[+_all:account +_all:set
+_all:up],from[0],size[20]: Fetch Failed [Failed to highlight field
[post_content]]]; nested: StringIndexOutOfBoundsException[String index out
of range: -5]; }]","status":500}

A query like
"relationship learning"
runs fine, but
"relationship centered learning"
throws the error. In fact, adding any of the letters c, d, j, q, x, or z to
"relationship learning" (for example "d relationship learning") throws
the error.

It's truly maddening.

I'm running Elasticsearch 0.19.2 with Tire.
I just want to know where to start looking; any ideas will help.

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Can you see any errors in the log files that give more information about
what happens here? Or can you make this reproducible with a self-contained
gist?

simon

On Monday, April 1, 2013 8:07:32 AM UTC+2, concept47 wrote:

> I cannot make head or tail of this error [...]

It's too random to reproduce, and I'm trying to track down what could be
wrong.
The error occurs on random queries, but once a query errors out, it
always errors out.

Nothing in the Elasticsearch logs. In fact, this is all that's in there:

[2013-04-01 05:20:37,638][INFO ][org.elasticsearch.node ] [Arclight] {0.19.2}[30832]: initializing ...
[2013-04-01 05:20:37,758][INFO ][org.elasticsearch.plugins] [Arclight] loaded , sites [mapper-attachments, cloud-aws]
[2013-04-01 05:20:41,208][INFO ][org.elasticsearch.node ] [Arclight] {0.19.2}[30832]: initialized
[2013-04-01 05:20:41,208][INFO ][org.elasticsearch.node ] [Arclight] {0.19.2}[30832]: starting ...
[2013-04-01 05:20:41,309][INFO ][org.elasticsearch.transport] [Arclight] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.0.92.233:9300]}
[2013-04-01 05:20:44,367][INFO ][org.elasticsearch.cluster.service] [Arclight] new_master [Arclight][Hc-s8CDOSuq6iN21XM0TAg][inet[/10.0.92.233:9300]], reason: zen-disco-join (elected_as_master)
[2013-04-01 05:20:44,425][INFO ][org.elasticsearch.discovery] [Arclight] elasticsearch-production/Hc-s8CDOSuq6iN21XM0TAg
[2013-04-01 05:20:44,453][INFO ][org.elasticsearch.http ] [Arclight] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.0.92.233:9200]}
[2013-04-01 05:20:44,453][INFO ][org.elasticsearch.node ] [Arclight] {0.19.2}[30832]: started
[2013-04-01 05:20:47,454][INFO ][org.elasticsearch.gateway] [Arclight] recovered [3] indices into cluster_state

On Monday, April 1, 2013 4:19:00 AM UTC-5, simonw wrote:

can you see any errors in the log files that give more information about
what happens here? or can you make this reproducible with a selfcontained
gist?

simon

Here's a gist from someone who was having the exact same problem, complete
with an Elasticsearch stack trace

Here's my stack trace (finally figured out how to get it):
stack trace of elasticsearch highlighting error/bug · GitHub
Thanks for the help.

This seems to be a Lucene issue; I have a Lucene test case that reproduces
this problem. It seems to be caused by
this: [LUCENE-4899] FastVectorHighlihgter fails with SIOOB if single phrase or term is > fragCharSize - ASF JIRA

Can you raise the fragment size to see if it goes away?

simon
On Tuesday, April 2, 2013 12:50:28 AM UTC+2, concept47 wrote:

Here's my stack trace (finally figured out how to get it)
stack trace of elasticsearch highlighting error/bug · GitHub
thanks for the help
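
A minimal sketch of the suggestion to raise the fragment size, written as a Python script that only builds and prints a search request body (it does not contact a cluster). The `query_string` query with `default_operator` set to `AND` is an assumption inferred from the parsed query in the error message; the values 300 and 3 are illustrative, and `build_search_body` is a hypothetical helper, not part of Tire or Elasticsearch.

```python
import json


def build_search_body(query_text, field="post_content", fragment_size=100):
    """Build a search request body that highlights `field`.

    `fragment_size` is the highlight fragment length in characters.
    """
    return {
        "query": {
            "query_string": {
                "query": query_text,
                # Assumption: every term is required, as in
                # "+_all:account +_all:set +_all:up" from the error message.
                "default_operator": "AND",
            }
        },
        "highlight": {
            "fields": {
                field: {
                    "fragment_size": fragment_size,
                    # Illustrative value, not taken from the thread.
                    "number_of_fragments": 3,
                }
            }
        },
    }


# One of the failing queries from the thread, with a larger fragment size.
body = build_search_body("relationship centered learning", fragment_size=300)
print(json.dumps(body, indent=2))
```

If the same query stops failing with a larger `fragment_size`, that would be consistent with the cause described in LUCENE-4899.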

We can't, because specific UX interactions in our app depend on that
particular restriction.
What we will do is drop the use of the fast vector highlighter, then watch
that Lucene ticket to see when it gets resolved and eventually makes its way
into Elasticsearch, before turning it back on.

Thank you so very much for all your help, Simon.
This was driving me absolutely crazy.

Another question, where can I report this bug?

Hey,

I can try to put this into a Lucene test, but I guess this is an ngram
issue rather than a highlighter issue.
Can you open an issue on GitHub (elasticsearch) so I can take it from there?

simon

On Thursday, April 4, 2013 7:13:40 AM UTC+2, concept47 wrote:

Another question, where can I report this bug?

Elasticsearch highlighting on ngram filter is weird if min_gram is set to 1 - Stack Overflow

Done. https://github.com/elasticsearch/elasticsearch/issues/2839
Thank you!

Ike

Hey Simon,
I just discovered that term_vector accepts the following values: yes,
with_offsets, and with_positions, in addition to with_positions_offsets.
Could using one of the other values (not with_positions_offsets) avoid
this exception?
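
A hedged sketch related to the question above. To my understanding of Elasticsearch releases of this era, the fast vector highlighter is selected only when the field is mapped with `term_vector` set to `with_positions_offsets`, so any other value should fall back to the plain highlighter; treat that rule as an assumption to verify against your version. The type name `post` and both helper functions are hypothetical, and changing `term_vector` on an existing field generally requires reindexing.

```python
import json

# Values named in the message above, plus "no" (the documented default).
TERM_VECTOR_VALUES = (
    "no",
    "yes",
    "with_offsets",
    "with_positions",
    "with_positions_offsets",
)


def build_mapping(term_vector, doc_type="post", field="post_content"):
    """Build a mapping body for one string field with the given term_vector."""
    if term_vector not in TERM_VECTOR_VALUES:
        raise ValueError("unknown term_vector value: %r" % (term_vector,))
    return {
        doc_type: {
            "properties": {
                field: {"type": "string", "term_vector": term_vector}
            }
        }
    }


def selects_fast_vector_highlighter(term_vector):
    """Assumed rule: only positions plus offsets enable the fast vector highlighter."""
    return term_vector == "with_positions_offsets"


for value in TERM_VECTOR_VALUES:
    print(value, selects_fast_vector_highlighter(value))

print(json.dumps(build_mapping("with_positions"), indent=2))
```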

On Wednesday, April 3, 2013 4:27:00 AM UTC-5, simonw wrote:

this seems to be a Lucene issue, I have a lucene testcase that reproduces
this problem. it seems to be caused by this:
[LUCENE-4899] FastVectorHighlihgter fails with SIOOB if single phrase or term is > fragCharSize - ASF JIRA

can you raise frag size to see if it goes away?

Hi Simon,
It seems you got this fixed in
Lucene: [LUCENE-4899] FastVectorHighlihgter fails with SIOOB if single phrase or term is > fragCharSize - ASF JIRA
Any idea when it will go into Elasticsearch?
