Highlighting Inconsistent behaviour

Hi

I'm testing highlighting and getting very strange errors on a 3-node
cluster. Without highlighting, the query runs fine and I get all the
documents I want without failures. When I add highlighting, the same query
returns with failures, and the number of failures fluctuates between runs
even though the documents haven't changed.

The highlighting part of the query is:
"highlight": {
    "number_of_fragments": 10,
    "fragment_size": 10,
    "pre_tags": [""],
    "post_tags": [""],
    "fields": {
        "title": {"number_of_fragments": 0},
        "body": {"number_of_fragments": 1}
    }
}

"body" isn't a stored field so it's being fetched from "_source" but
"title" is stored.
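
For anyone trying to reproduce this, here is a minimal sketch of how the highlight block above could sit inside a complete search body. The `match_all` query is only a placeholder (the actual query is not shown in this post), `build_search_body` is a made-up helper name, and the empty tag strings are kept exactly as posted.

```python
import json


def build_search_body(query, fragment_size=10):
    """Wrap a query with the highlight settings quoted in the post."""
    return {
        "query": query,
        "highlight": {
            "number_of_fragments": 10,
            "fragment_size": fragment_size,
            "pre_tags": [""],
            "post_tags": [""],
            "fields": {
                # 0 fragments: highlight the whole field instead of snippets
                "title": {"number_of_fragments": 0},
                "body": {"number_of_fragments": 1},
            },
        },
    }


# Placeholder query; the original query is not shown in the thread.
body = build_search_body({"match_all": {}})
print(json.dumps(body, indent=2))
```

Keeping `fragment_size` as a parameter makes it easy to rerun the same request with a different value while everything else stays fixed.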

Running the query with highlighting causes this:

  failed: 1
  failures: [
    {
      status: 500
      reason: RemoteTransportException[Failed to deserialize response
        of type [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        TransportSerializationException[Failed to deserialize response of type
        [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        IndexOutOfBoundsException[Invalid combined index of 200554, maximum is
        855];
    }
  ]

Repeating the same query:

  failed: 2
  failures: [
    {
      status: 500
      reason: RemoteTransportException[Failed to deserialize response
        of type [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        TransportSerializationException[Failed to deserialize response of type
        [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        IndexOutOfBoundsException[Invalid combined index of 199753, maximum is
        819];
    }
    {
      status: 500
      reason: RemoteTransportException[Failed to deserialize response
        of type [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        TransportSerializationException[Failed to deserialize response of type
        [org.elasticsearch.search.fetch.FetchSearchResult]]; nested:
        IndexOutOfBoundsException[Invalid combined index of 200584, maximum is
        885];
    }
  ]

It's the same set of documents that are failing but I can't guess why!
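
To compare runs, it may help to reduce each response to its failure count and the innermost exception of each reason string. This is a rough sketch, assuming the response has been parsed into a dict and that `failed` and `failures` sit under a `_shards` key (the output above shows only the two keys, not their parent). `summarize_failures` is a made-up helper name.

```python
def summarize_failures(response):
    """Return (failed_count, [(status, innermost_exception), ...])."""
    shards = response.get("_shards", {})  # assumed parent key
    summary = []
    for failure in shards.get("failures", []):
        reason = failure.get("reason", "")
        # The last "nested:" segment is the innermost exception in the chain.
        innermost = reason.split("nested:")[-1].strip().rstrip(";").strip()
        summary.append((failure.get("status"), innermost))
    return shards.get("failed", 0), summary


# Sample built from the first failure quoted above.
sample = {
    "_shards": {
        "failed": 1,
        "failures": [
            {
                "status": 500,
                "reason": (
                    "RemoteTransportException[Failed to deserialize response of type "
                    "[org.elasticsearch.search.fetch.FetchSearchResult]]; nested: "
                    "TransportSerializationException[Failed to deserialize response of type "
                    "[org.elasticsearch.search.fetch.FetchSearchResult]]; nested: "
                    "IndexOutOfBoundsException[Invalid combined index of 200554, maximum is 855];"
                ),
            }
        ],
    }
}

count, details = summarize_failures(sample)
print(count, details)
```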

--

Hey Eric,

can you provide a repeatable example of this, such as a couple of curl
commands including a sample document? I'd also like to know which version
of ES you are using.

simon

On Thursday, November 15, 2012 10:08:09 AM UTC+1, Eric Kazaki wrote:

--

Hey Simon

It's running on version 0.19.9. Here is what you requested:
https://gist.github.com/4077780. I just removed some fields from the
sample documents that are irrelevant.

--

Hmm, I think some important bits of information are missing here. At least,
I cannot reproduce it based on your gist:
https://gist.github.com/9be449c5072fb643fc82
I have a few questions though.

Can you see any corresponding error messages in the log files on any of the
nodes?
Can you retrieve one tweet at a time to see if it still fails?
I noticed that the second tweet was written in Japanese, but in your example
all Japanese characters are replaced with "???????". I'm not sure where it's
getting broken.
Could you repeat the same query with the fragment size increased to 20? Does
it still fail?
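
The one-document-at-a-time check could be scripted along these lines. This is only a sketch: `run_search` is a hypothetical caller-supplied function that runs the highlighted query restricted to a single document with a given fragment size and returns the parsed response, and reading the failure count from `_shards.failed` is an assumption about the response layout. The stub and the second document ID are made up for illustration.

```python
def isolate_failures(doc_ids, run_search, fragment_sizes=(10, 20)):
    """Return {(doc_id, fragment_size): failed_shards} for failing combinations."""
    failing = {}
    for doc_id in doc_ids:
        for size in fragment_sizes:
            response = run_search(doc_id, size)
            failed = response.get("_shards", {}).get("failed", 0)
            if failed:
                failing[(doc_id, size)] = failed
    return failing


# Stub standing in for a real cluster call, for illustration only.
def fake_search(doc_id, fragment_size):
    broken = doc_id == "7211941" and fragment_size == 10
    return {"_shards": {"failed": 1 if broken else 0}}


result = isolate_failures(["7211941", "1"], fake_search)
print(result)
```

If a document fails at one fragment size but not the other, that points at the highlighter output; if it fails regardless, the document itself is the more likely suspect.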

On Thursday, November 15, 2012 5:06:25 AM UTC-5, Eric Kazaki wrote:

--

I've attached the log to the gist
(https://gist.github.com/4077780#file_error%20log). Also double checked my
mapping and all those fields are stored. Could it be related to the fact
that some hits have the wrong encoding?

--

If it's encoding, it would be nice to be able to reproduce it. Could you do
this:

curl -s "http://your-server:9200/twitter/tweet/7211941?fields=title,body" > rec.txt

and then post rec.txt somewhere as is, so it has all characters in body
preserved?

On Friday, November 16, 2012 6:03:02 AM UTC-5, Eric Kazaki wrote:

--

It seems to be related to
https://groups.google.com/d/topic/elasticsearch/Cl0lV4AMszo/discussion
I'm running on AWS, and after replacing one node that was running out of
space, I can't reproduce the error anymore.

--