Once an inbound connection is obtained, the client sends an HTTP
request message (Section 3) with a request-target derived from the
target URI. There are four distinct formats for the request-target,
depending on both the method being requested and whether the request
is to a proxy.
    request-target = origin-form
                   / absolute-form
                   / authority-form
                   / asterisk-form
and also:
To allow for transition to the absolute-form for all requests in some
future version of HTTP, a server MUST accept the absolute-form in
requests, even though HTTP/1.1 clients will only send them in
requests to proxies.
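To make the four forms concrete, here is a minimal Python sketch of how a server might tell them apart. The function name `classify_request_target` is hypothetical and this is not Elasticsearch code; it only illustrates the grammar quoted above.

```python
from urllib.parse import urlsplit


def classify_request_target(method: str, target: str) -> str:
    """Return which of the four request-target forms `target` uses."""
    if target == "*":
        # asterisk-form: used for a server-wide OPTIONS request
        return "asterisk-form"
    if target.startswith("/"):
        # origin-form: absolute path plus optional query
        return "origin-form"
    if method == "CONNECT":
        # authority-form: host:port, used only with CONNECT
        return "authority-form"
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # absolute-form: a full URI such as http://host:port/path
        return "absolute-form"
    raise ValueError(f"unrecognized request-target: {target!r}")
```

With this sketch, `classify_request_target("POST", "http://localhost:9200/_search")` returns `"absolute-form"`, which is the case discussed in this thread.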
But when I send an HTTP request with an absolute-form request-target (POST http://localhost:9200/_search) to Elasticsearch, I get the following response:
HTTP/1.1 400 Bad Request
content-type: text/plain; charset=UTF-8
content-length: 74

No handler found for uri [http://localhost:9200/_search] and method [POST]
I think it would be nice for Elasticsearch to support the full HTTP spec. Should I open an issue?
I concur that this is a spec violation. I spent some time looking into what fixing it would involve, and it has left me wondering whether we should fix it at all.
To be clear, a client should not be sending a request with a request target in absolute form like this.
It's easy to add code that accepts a request in this form; the complexity arises in validating such a request. What you don't want is this:
GET http://www.example.com:80/_cluster/state?pretty=true HTTP/1.1
being sent to an Elasticsearch node and routed to the REST cluster state handler, because Elasticsearch is not listening on port 80 on any interface with an address that www.example.com resolves to. Rejecting it properly, however, would mean doing an arbitrary DNS lookup to find out whether the request is valid, and that's a giant no-no. So I'm left wondering whether we really need to do this at all; perhaps the only thing to do is flatly reject requests in this form (yes, I know this would violate the spec, but I'm okay with that in some cases, and this might be one).
I agree with Jason that supporting the absolute-form could expose us to attack vectors such as a denial of service (DoS) through slow DNS lookups. What we could do instead is compare the authority in the request-target with the value of the Host header and reject any request where the two differ. I'm not sure that would really be spec compliant, though...
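As a rough illustration of the Host-header comparison idea, here is a minimal Python sketch. The function name `to_origin_form` is hypothetical, and the plain string comparison is a simplifying assumption: it does not normalize default ports, so "localhost" and "localhost:80" would be treated as different. No DNS lookup is involved.

```python
from urllib.parse import urlsplit


def to_origin_form(target: str, host_header: str) -> str:
    """Reduce an absolute-form target to origin-form.

    Raises ValueError when the authority in the target disagrees with
    the Host header. Targets in any other form are returned unchanged.
    """
    parts = urlsplit(target)
    if not (parts.scheme and parts.netloc):
        return target  # not absolute-form, nothing to do
    # Assumption: case-insensitive string match, no port normalization.
    if parts.netloc.lower() != host_header.strip().lower():
        raise ValueError("request-target authority does not match Host header")
    path = parts.path or "/"
    return f"{path}?{parts.query}" if parts.query else path
```

Under these assumptions, a request for `http://localhost:9200/_search` with `Host: localhost:9200` would be handled as `/_search`, while the www.example.com:80 request above would be rejected when the Host header says `localhost:9200`.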
When making a request directly to an origin server, other than a
CONNECT or server-wide OPTIONS request (as detailed below), a client
MUST send only the absolute path and query components of the target
URI as the request-target.
To allow for transition to the absolute-form for all requests in some
future version of HTTP, a server MUST accept the absolute-form in
requests, even though HTTP/1.1 clients will only send them in
requests to proxies.
but we already have a future version of HTTP which changes the protocol dramatically in a different direction.
We'd essentially be adding code that is executed on every request for a potential future change that, in all probability, will never occur.