NEST suddenly starting to create incorrect requests

We have a .NET application using NEST version 2.3.1 against Elasticsearch 2.2.0. The application is a backend server application running on multiple hosts under IIS, all performing the same queries against an Elasticsearch cluster.

Recently we experienced very strange behavior at one of our customer sites running the application on 4 IIS hosts. Suddenly all queries against Elasticsearch from one of the nodes started to fail, the other 3 nodes continued to work fine.

The error reported from Elasticsearch for queries from the failing node was the following:

StatusCode: 400, Method: POST, Url: https://--the-url--/_search?timeout=5s, Elasticsearch reason: 
[term] query does not support [has_child], Exception: System.Net.WebException: The remote server returned an error: (400) Bad Request.
   at System.Net.HttpWebRequest.GetResponse()
   at Elasticsearch.Net.HttpConnection.Request[TReturn](RequestData requestData)

After enabling verbose logging of the request on the application side we found that we actually did send the following query in the request:

{"query":{"bool":{"filter":[{"term":{"the_property_name":{"has_child":"4"}}}]}}}

This is obviously an incorrect query, so the Elasticsearch response is correct. The query should look like this (with "value" instead of "has_child"):

{"query":{"bool":{"filter":[{"term":{"the_property_name":{"value":"4"}}}]}}}

From the C# code of the application we just perform the following for the term condition using the NEST client:

Query<T>.Term("the_property_name", "4")

We experienced the exact same thing with not only term queries, but also regexp queries. It appears that the NEST client suddenly started writing incorrect requests using "has_child" instead of "value" as the property name for the condition value. After restarting the process (by performing an IIS recycle of the application pool) the problem disappeared.

I have browsed the source code of NEST/Elasticsearch.Net, looking at how the serialization of queries is performed. Including trying to find e.g. thread unsafety anywhere in FieldNameQueryJsonConverter, ReserializeJsonConverter and the extension method GetCachedObjectProperties. But I have not found anything that looks incorrect.

Has anyone experienced anything similar? Any idea on what could cause this other than some sort of memory corruption?

Just some more details from my continued investigation. We took a memory dump before performing the recycling, and I have now analyzed it.

NEST has a static cache (TypeExtensions._cachedTypeProperties) which contains a mapping from types (like e.g. TermQuery) to a list of JsonProperty objects, each describing a property of the type. Among other things the JsonProperty contains PropertyName (which is the name used in the serialized json, typically set with a JsonProperty attribute) and UnderlyingName (which is the name of backing field/property in the type). The TermQuery class implements the ITermQuery interface:

public interface ITermQuery : IFieldNameQuery
{
	[JsonProperty("value")]
	object Value { get; set; }
}

Meaning that the JsonProperty for the Value should have PropertyName="value" and UnderlyingName="Value".

But listing all JsonProperty objects for the TermQuery in WinDbg gives the following:

[0] 000000040ef31110 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f1bb9f8 (System.String) Length=13, String="conditionless"
    <UnderlyingName>k__BackingField:000000050f0bffa0 (System.String) Length=13, String="Conditionless"
[1] 000000040ef31238 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f1bbaa0 (System.String) Length=5, String="_name"
    <UnderlyingName>k__BackingField:000000060eebd5c8 (System.String) Length=4, String="Name"
[2] 000000040ef31360 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f1bbb38 (System.String) Length=5, String="boost"
    <UnderlyingName>k__BackingField:000000060eebd5f0 (System.String) Length=5, String="Boost"
[3] 000000040ef31488 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f0d07b8 (System.String) Length=10, String="isVerbatim"
    <UnderlyingName>k__BackingField:000000050f0bffd8 (System.String) Length=10, String="IsVerbatim"
[4] 000000040ef315b0 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f0d0478 (System.String) Length=8, String="isStrict"
    <UnderlyingName>k__BackingField:000000060eebd618 (System.String) Length=8, String="IsStrict"
[5] 000000040ef316d8 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f0d0618 (System.String) Length=10, String="isWritable"
    <UnderlyingName>k__BackingField:000000060eebd648 (System.String) Length=10, String="IsWritable"
[6] 000000040ef31800 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000070f1bbca8 (System.String) Length=5, String="field"
    <UnderlyingName>k__BackingField:000000070f1baf70 (System.String) Length=5, String="Field"
[7] 000000040ef31928 (Newtonsoft.Json.Serialization.JsonProperty)
    _propertyName:000000050f0b7078 (System.String) Length=9, String="has_child"
    <UnderlyingName>k__BackingField:000000040ef309f0 (System.String) Length=5, String="Value"

Notice the last one. Underlying name is "Value", but the property name is "has_child". So this is what is causing the incorrectly generated requests. Where does that come from?

Interesting. Would you mind opening an issue on the github repo to discuss further?

Done: Issue 2881

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.