A previously working ES 9.0 query may sometimes produce a ReduceSearchPhaseException after adding new data


(Jack Key) #1

Has anyone else noticed an occasional ReduceSearchPhaseException when querying stored data?
Specifically, my query on user-defined stored fields works fine at first, but tends to break after data is inserted.
This is puzzling because nothing is logged by elasticsearch concerning any sort of error during the data upload or the queries themselves. Could the data upload invalidate an existing, previously working query? The error is difficult to reproduce consistently, because it seems to break arbitrarily. More details are below:

I uploaded via REST into an ES 9.0 cluster (deployed on EC2 with ec2_discovery and s3_gateway).

After PUTting a mapping for my type "biocompare/halcyon" and inserting 3 records, I was able to execute the following search from a terminal successfully, producing good results:

curl -XPUT cloud:9200/biocompare/exampletype/_mapping -d '
{
  "exampletype" : {
    "properties" : {
      "item_name" : {
        "type" : "multi_field",
        "fields" : {
          "default" : { "type" : "string", "store" : "no", "index" : "analyzed" },
          "facet" : { "type" : "string", "store" : "yes", "index" : "not_analyzed" }
        }
      },
      "price" : {
        "type" : "multi_field",
        "fields" : {
          "default" : { "type" : "float", "store" : "no", "index" : "analyzed" },
          "facet" : { "type" : "string", "store" : "yes", "index" : "not_analyzed" }
        }
      },
      "Description" : {
        "type" : "multi_field",
        "fields" : {
          "default" : { "type" : "string", "store" : "no", "index" : "analyzed" },
          "facet" : { "type" : "string", "store" : "yes", "index" : "not_analyzed" }
        }
      }
    }
  }
}
'

</MAPPING COMMAND>

$ curl -XPOST cloud:9200/biocompare/exampletype/_search -d ' { "from" : 0, "size" : 1, "fields" : ["_id","price.facet"], "sort" : { "Description.facet" : { } }, "query" : { "match_all" : { } }, "facets" : { "vendor" : { "terms" : { "field" : "price.facet", "size" : 100, "analyzer" : "none" }, "global" : false } } } '

{"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":null,"hits":[{"_index":"biocompare","_type":"exampletype","_id":"657","_score":null}]},"facets":{"vendor":{"_type":"terms","_field":"price.facet","terms":[]}}}}}

However, after inserting just a few more records, the same query produces a ReduceSearchPhaseException, which is a bad result:

{"error":"ReduceSearchPhaseException[Failed to execute phase [query], [reduce] ]; nested: "}

How can this be? The same query against the same node seems to be broken after inserting additional records!
Is it possible to break the query just by adding new records to the data store?

Although each new record is acknowledged as "ok" in the JSON response, I find at runtime that queries using "sort" and/or "fields" over user-mapped stored fields will fail. This may happen after any insert (the 1st or the 3001st; it is not consistent). Using the same script to insert the data, in the same order, I get this failure at different points. The logs, which have been turned on for all events I know about (action:DEBUG, gateway:DEBUG, index.shard.recovery:DEBUG), aren't producing anything for this error.
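
For anyone trying to pin down the insert at which the query starts failing, here is a rough Python sketch of the procedure described above: index one record, rerun the search, and stop at the first ReduceSearchPhaseException. The host, index, type, and search body are taken from this post; the helper names (http_json, is_reduce_failure, first_failing_insert) and the record format are hypothetical, and the Content-Type header is an assumption that may not be needed on this ES version.

```python
import json
import urllib.error
import urllib.request

# Host, index, and type as used in the curl commands in this post.
BASE_URL = "http://cloud:9200/biocompare/exampletype"

# The search body from this post, unchanged.
SEARCH_BODY = {
    "from": 0,
    "size": 1,
    "fields": ["_id", "price.facet"],
    "sort": {"Description.facet": {}},
    "query": {"match_all": {}},
    "facets": {
        "vendor": {
            "terms": {"field": "price.facet", "size": 100, "analyzer": "none"},
            "global": False,
        }
    },
}


def http_json(method, url, body):
    """Send a JSON request and return the decoded JSON reply (error bodies too)."""
    data = json.dumps(body).encode("utf-8")
    # Assumption: sending a JSON content type is harmless on this cluster.
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as reply:
            return json.loads(reply.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx replies still carry the {"error": ...} body we want to inspect.
        return json.loads(err.read().decode("utf-8"))


def is_reduce_failure(response):
    """True if the reply carries the ReduceSearchPhaseException error string."""
    return "ReduceSearchPhaseException" in str(response.get("error", ""))


def first_failing_insert(records, send=http_json):
    """Index (doc_id, doc) pairs one by one, searching after each insert.

    Returns the 1-based count of inserts at which the search first fails,
    or None if it never fails.
    """
    for count, (doc_id, doc) in enumerate(records, start=1):
        send("PUT", f"{BASE_URL}/{doc_id}", doc)
        result = send("POST", f"{BASE_URL}/_search", SEARCH_BODY)
        if is_reduce_failure(result):
            return count
    return None
```

Running first_failing_insert several times over the same records, in the same order, would show whether the failure point really moves between runs. Note that a search issued right after an insert may not yet see that record, so the reported count is only an upper bound on when things went wrong.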

However, queries using the default stored fields (e.g. "_id" and "_source") are still fine and continue to return good results.
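
Since the failure seems tied to "sort" and/or "fields" on user-mapped stored fields, one way to narrow it down is to rerun the failing search with one clause removed at a time, plus a baseline that touches only the default stored fields. The Python sketch below only builds those request bodies; the function name query_variants and the variant labels are made up for illustration, and the search body is the one from this post.

```python
import copy

# The failing search body from this post.
FULL_QUERY = {
    "from": 0,
    "size": 1,
    "fields": ["_id", "price.facet"],
    "sort": {"Description.facet": {}},
    "query": {"match_all": {}},
    "facets": {
        "vendor": {
            "terms": {"field": "price.facet", "size": 100, "analyzer": "none"},
            "global": False,
        }
    },
}


def query_variants(full):
    """Return named copies of the search body, each dropping one suspect clause."""
    variants = {"full": copy.deepcopy(full)}
    for clause in ("fields", "sort", "facets"):
        if clause in full:
            reduced = copy.deepcopy(full)
            del reduced[clause]
            variants["without_" + clause] = reduced
    # Baseline: same query, but only the default stored fields, no sort or facets.
    variants["default_fields_only"] = {
        "from": full.get("from", 0),
        "size": full.get("size", 1),
        "fields": ["_id", "_source"],
        "query": copy.deepcopy(full["query"]),
    }
    return variants
```

Each variant can be serialized with json.dumps and passed to curl -d against the same _search URL. If only the variants that keep "sort" or "fields" fail, that would support the suspicion that the user-mapped stored fields are involved.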

Does this look familiar to anyone? Also, please let me know and I will quickly provide any additional info you might need on the environment, logs, data uploaded, etc. to help understand the nature of my concern.

Regards, Jack Key


(Shay Banon) #2

Answered on the other thread. Damn google groups with its spam filter...

On Thu, Jul 29, 2010 at 11:52 PM, Jack Key joeandrewkey@gmail.com wrote:

Has anyone else noticed an occasional ReduceSearchPhaseException when
querying stored data?
Specifically, my query on user-defined stored fields works fine at first,
but tends to break after data is inserted.
This is puzzling because nothing is logged by elasticsearch concerning any
sort of error during the data upload or the queries themselves. Could the
data upload invalidate an existing, previously working query? The error is
difficult to reproduce consistently, because it seems to break
arbitrarily. More details are below:

I uploaded via REST into an ES 9.0 cluster (deployed on EC2 with
ec2_discovery and s3_gateway).

After PUTting a mapping for my type "biocompare/halcyon" and inserting 3
records, I was able to execute the following search from a terminal
successfully, producing good results:

curl -XPUT cloud:9200/biocompare/halcyon/_mapping -d ' { "halcyon" : { "properties" : { "item_name" : { "type" : "multi_field", "fields" : { "default" : { "type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : { "type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"vendor" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"modified" : { "omit_term_freq_and_positions" : true, "index_name" :
"modified", "index" : "not_analyzed", "omit_norms" : true, "store" : "no",
"boost" : 1.0, "format" : "dateOptionalTime", "precision_step" : 4,
"term_vector" : "no", "type" : "date" },

"price" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"float", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Antigen" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Antigen Species" : { "type" : "multi_field", "fields" : { "default" : {
"type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : {
"type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"Antigen Synonyms" : { "type" : "multi_field", "fields" : { "default" : {
"type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : {
"type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"Catalog Number" : { "type" : "multi_field", "fields" : { "default" : {
"type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : {
"type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"Concentration" : { "type" : "multi_field", "fields" : { "default" : {
"type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : {
"type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"Conjugate" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Description" : { "type" : "multi_field", "fields" : { "default" : { "type"
: "string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Form" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },
"Host Species" : { "type" : "multi_field", "fields"
: { "default" : {
"type" : "string", "store" : "no", "index" : "analyzed" }, "facet" : {
"type": "string", "store" : "yes", "index" : "not_analyzed" } } },

"Immunogen" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Isotype" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Quantity" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Reactivity" : { "type" : "multi_field", "fields" : { "default" : { "type"
:
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"References" : { "type" : "multi_field", "fields" : { "default" : { "type"
:
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } },

"Type" : { "type" : "multi_field", "fields" : { "default" : { "type" :
"string", "store" : "no", "index" : "analyzed" }, "facet" : { "type":
"string", "store" : "yes", "index" : "not_analyzed" } } }

           }
   }

}
'
</MAPPING COMMAND>

$ curl -XPOST cloud:9200/biocompare/halcyon/_search -d ' { "from" : 0, "size" : 1, "sort" : ["Description.facet"], "query" : { "match_all" : { } }} '

{"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":null,"hits":[{"_index":"biocompare","_type":"halcyon","_id":"50682","_score":null,"fields":{"item_name.facet":"Goat Anti-Rabbit IgG, Horseradish Peroxidase Conjugated"}}]}}
</GOOD RESULTS>

However, after inserting just a few more records, the same query produces a
ReduceSearchPhaseException, which is a bad result:

{"error":"ReduceSearchPhaseException[Failed to execute phase [query], [reduce] ]; nested: "}

How can this be? The same query against the same node seems to be broken
after inserting additional records!
Is it possible to break the query just by adding new records to the data
store?

Although each new record is acknowledged as "ok" in the JSON response, I
find at runtime that queries using "sort" and/or "fields" over user-mapped
stored fields will fail. This may happen after any insert (the 1st or the
3001st; it is not consistent). Using the same script to insert the data, in
the same order, I get this failure at different points. The logs, which
have been turned on for all events I know about (action:DEBUG,
gateway:DEBUG, index.shard.recovery:DEBUG), aren't producing anything for
this error.

However, queries using the default stored fields (e.g. "_id" and "_source")
are still fine and continue to return good results.

Does this look familiar to anyone? Also, please let me know and I will
quickly provide any additional info you might need on the environment,
logs, data uploaded, etc. to help understand the nature of my concern.

Regards, Jack Key

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/A-previously-working-ES-9-0-query-may-sometimes-produce-a-ReduceSearchPhaseException-after-adding-nea-tp1004896p1004896.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

