HTTP request and RFC

Hi,

the HTTP/1.1 spec (RFC 7230) defines the request-target as follows:

Once an inbound connection is obtained, the client sends an HTTP
request message (Section 3) with a request-target derived from the
target URI. There are four distinct formats for the request-target,
depending on both the method being requested and whether the request
is to a proxy.

request-target = origin-form
                / absolute-form
                / authority-form
                / asterisk-form

and, for the absolute-form, it adds:

To allow for transition to the absolute-form for all requests in some
future version of HTTP, a server MUST accept the absolute-form in
requests, even though HTTP/1.1 clients will only send them in
requests to proxies.
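To make the four forms concrete, here is a rough Python sketch of how a server might tell them apart. The function name classify_request_target is made up for illustration and the checks are deliberately simplified (for example, it does not verify that asterisk-form is only used with OPTIONS), so treat it as an approximation of the grammar, not a full parser.

```python
def classify_request_target(method: str, target: str) -> str:
    """Return the name of the request-target form (simplified, illustrative)."""
    if target == "*":
        # asterisk-form: server-wide OPTIONS
        return "asterisk-form"
    if method.upper() == "CONNECT":
        # authority-form: host:port, used only with CONNECT
        return "authority-form"
    if target.startswith("/"):
        # origin-form: absolute path plus optional query
        return "origin-form"
    if "://" in target:
        # absolute-form: a full URI including scheme and authority
        return "absolute-form"
    raise ValueError(f"unrecognised request-target: {target!r}")
```

With this sketch, "/_search" classifies as origin-form and "http://localhost:9200/_search" as absolute-form.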

But when I send the following HTTP request to Elasticsearch:

POST http://localhost:9200/_search HTTP/1.1
host: localhost:9200
date: Fri, 28 Apr 2017 19:30:04 GMT
content-length: 42
content-type: application/json

{"query":{"match":{"_all":"Hello World"}}}

for example with a command like

cat http-uri-test.txt | nc localhost 9200

the following response is returned

HTTP/1.1 400 Bad Request
content-type: text/plain; charset=UTF-8
content-length: 74

No handler found for uri [http://localhost:9200/_search] and method [POST]

I think it would be nice for Elasticsearch to support the full HTTP spec. Should I open an issue?

Yes please :smiley:

I concur that this is a spec violation. I spent some time looking into what fixing it would involve, and it has left me wondering whether we should fix it at all.

To be clear, a client should not be sending a request with a request target in absolute form like this.

Yet it's easy to add code that handles receiving a request in this form; the complexity arises when validating such a request. What you don't want is this:

GET http://www.example.com:80/_cluster/state?pretty=true HTTP/1.1

sent to an Elasticsearch node and routed to the REST cluster state handler, because Elasticsearch is not listening on port 80 on any interface with an address that www.example.com resolves to. Checking that, however, means doing an arbitrary DNS lookup to find out whether the request is valid, and that's a giant no-no. So I'm left wondering whether we really need to do this at all; the only change might be to flat-out reject requests in this form (yes, I know this would violate the spec, but I'm okay with that in some cases and this might be such a case).
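For reference, the "easy" half could look roughly like the Python sketch below: reduce an absolute-form target to its path and query before handler lookup. This is not Elasticsearch code; to_origin_form is a hypothetical helper, and it performs no validation of the authority at all, which is exactly the hard part described above.

```python
from urllib.parse import urlsplit


def to_origin_form(request_target: str) -> str:
    """Strip scheme and authority from an absolute-form request-target.

    Hypothetical helper: it does NOT check that the authority actually
    refers to this server.
    """
    parts = urlsplit(request_target)
    if not parts.scheme or not parts.netloc:
        # No scheme/authority present: leave the target unchanged.
        return request_target
    path = parts.path or "/"
    # Re-attach the query string only if one was present.
    return path + ("?" + parts.query if parts.query else "")
```

Applied to the example above, this would return /_cluster/state?pretty=true regardless of whether www.example.com has anything to do with the node, which is why the blind rewrite is not enough on its own.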

I will solicit thoughts from others.

I agree with Jason that supporting the absolute-form could expose us to attack vectors such as a DoS via slow DNS lookups. What we could do is compare the host in the request-target with the value of the Host header and reject anything that differs. I'm not sure that would really be spec compliant, though...
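A minimal Python sketch of that comparison, assuming a plain case-insensitive string match on host:port with no DNS lookup. The name authority_matches_host is invented for illustration, and the sketch makes no claim about spec compliance; note that it treats "www.example.com:80" and "www.example.com" as different because it does not normalise default ports.

```python
from urllib.parse import urlsplit


def authority_matches_host(request_target: str, host_header: str) -> bool:
    """Compare the authority of an absolute-form target with the Host header.

    Hypothetical check: purely textual, so no name resolution is involved.
    """
    parts = urlsplit(request_target)
    if not parts.netloc:
        # No authority in the target (e.g. origin-form): nothing to compare.
        return True
    return parts.netloc.lower() == host_header.strip().lower()
```

For the request in the original post, the target authority localhost:9200 equals the Host header, so it would be accepted; a target of http://www.example.com:80/... with the same Host header would be rejected.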

And to what end?

RFC 7230 - Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing says:

When making a request directly to an origin server, other than a
CONNECT or server-wide OPTIONS request (as detailed below), a client
MUST send only the absolute path and query components of the target
URI as the request-target.

And yes, RFC 7230 - Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing also says:

To allow for transition to the absolute-form for all requests in some
future version of HTTP, a server MUST accept the absolute-form in
requests, even though HTTP/1.1 clients will only send them in
requests to proxies.

but we already have a future version of HTTP which changes the protocol dramatically in a different direction.

We'd essentially be adding code that is executed on every request for a potential future change that, in all probability, will never occur.


Thanks @Clinton_Gormley, I think we will not proceed with changing anything here then.
