I've discovered an interesting problem when searching with the Python Elasticsearch DSL client on Google App Engine.
Google App Engine does not support urllib3; instead it uses a patched version of requests that proxies to an internal request API. I can index data just fine using the
RequestsHttpConnection connection class.
However, whenever I perform a search, I get garbage results (in that the results make no sense). I ran the output of
my_search_request.to_dict() through cURL and got the expected result. Upon further investigation, I noticed that Google App Engine strips the
Content-Length: header from the request as a security precaution. When I retried the same query through cURL, this time without the
Content-Length header, I got the same bad result.
To confirm, I set up a reverse proxy in front of my local Elasticsearch instance and pointed the connection in my code at it. Sure enough, the user agent was rewritten by Google and there was no Content-Length header.
Are there any known workarounds that can be done from the client (like sending an additional parameter)? Google does offer a native socket interface that is in beta, but I would prefer to use the standard HTTP fetch API that Google provides.
EDIT: It looks like URLFetch in App Engine mishandles the Content-Length header when a GET request has a body.
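One client-side idea I'm considering (a sketch only, not something I've verified on App Engine): since Elasticsearch's _search endpoint accepts POST as well as GET, a custom connection class could rewrite any GET that carries a body into a POST before URLFetch ever sees it, so there is no bodied GET for the proxy to mangle. The method-coercion logic would be something like this (the `AppEngineConnection` class name and the exact `perform_request` override are my assumptions, not part of the library):

```python
def coerce_method(method, body):
    """Rewrite GET-with-body to POST so App Engine's URLFetch never
    sees a GET request that carries a body (which is where it strips
    the Content-Length header). Elasticsearch's _search endpoint
    accepts POST as well as GET, so the query still works."""
    if method == "GET" and body:
        return "POST"
    return method

# A hypothetical connection subclass would apply this in perform_request,
# roughly (untested sketch):
#
# class AppEngineConnection(RequestsHttpConnection):
#     def perform_request(self, method, url, params=None, body=None, **kwargs):
#         return super().perform_request(
#             coerce_method(method, body), url,
#             params=params, body=body, **kwargs)
```

Requests without a body (index refreshes, document GETs, etc.) would pass through unchanged, so only the problematic bodied GETs are affected.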