Notes on HTTP/2
I recently had the opportunity to do a small presentation about HTTP/2. While preparing for this, I gathered lots of resources and dived into them to understand the protocol.
I am sharing the notes that formed the basis of the presentation, along with all the resources I used, in the hope that they are useful to some.
- h2 timeline
- Backwards compatibility
- Top new features
- Upgrade from h1
- Optimising for h2
- H2 for APIs
- The future
- HyperText Transfer Protocol, an application layer protocol
- presumes an underlying and reliable transport layer protocol; Transmission Control Protocol (TCP) is commonly used.
- Text-based (ASCII)
Why is there a need for a new protocol? It’s partly because the way we use the web has changed: interactivity, more media, more content.
# A bloated web
- Compare the homepage of the FBI in 1996
- …or the MIT in 1997
- to the site of the FBI today: fbi.gov
- The median page size is around 1.8 MB, and is getting larger.
- On average, over 100 individual resources are required to display each page.
- Stats about the size of web pages:
# Issues with HTTP 1.1
- Limited pipelining
Pipelining of requests results in a significant improvement, especially over high latency connections. It is less apparent on broadband connections, as the limitation of HTTP 1.1 still applies: the server must send its responses in the same order that the requests were received—so the entire connection remains first-in-first-out and HOL blocking can occur.
- Head-of-line blocking
- Bulky headers (User-Agent, Cookies) - cookies might be particularly long and are sent with every request
- these drawbacks encouraged developers to use multiple TCP connections
- even though HTTP 1.1 advised against this (source)
- … and the slow-start mechanism of TCP makes this somewhat suboptimal
# h2 timeline
- derived from SPDY (@ Google)
- Internet Engineering Task Force presented it for Proposed Standard in 2014
- Internet Engineering Steering Group approved it in early 2015
- spec was published in May 2015
# Backwards compatibility
- HTTP semantics (methods, status codes, headers) unaffected
- applications can be unaware of the protocol they use
- … unless they implement a web server
- … or a custom client
- … or want to optimise their site to the fullest
- … more on optimisation later
# Top new features
- binary framing layer
- header compression
- stream prioritisation
- server push
- flow control (similar to SSH)
# Binary framing layer
- while h1 is text-based, h2 is binary
- no need for multiple TCP connections per origin
The new binary framing layer enables full request and response multiplexing, by allowing the client and server to break down an HTTP message into independent frames, interleave them, and then reassemble them on the other end. HTTP/2 no longer needs multiple TCP connections to multiplex streams in parallel; each stream is split into many frames, which can be interleaved and prioritized. As a result, all HTTP/2 connections are persistent, and only one connection per origin is required, which offers numerous performance benefits.
Multiplexing of requests is achieved by having each HTTP request/response exchange associated with its own stream.
Note: Streams are largely independent of each other, so a blocked or stalled request or response does not prevent progress on other streams.
In fact, multiplexing introduces a ripple effect of numerous performance benefits across the entire stack of all web technologies, enabling us to:
- Interleave multiple requests in parallel without blocking on any one
- Interleave multiple responses in parallel without blocking on any one
- Use a single connection to deliver multiple requests and responses in parallel
- Remove unnecessary HTTP/1.x workarounds, such as concatenated files, image sprites, and domain sharding
- Deliver lower page load times by eliminating unnecessary latency and improving utilization of available network capacity
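As a rough illustration of the framing layer described above, the sketch below packs and unpacks the fixed 9-byte frame header (24-bit length, 8-bit type, 8-bit flags, 31-bit stream identifier) and reassembles two interleaved streams. The helper names `pack_frame` and `unpack_frames` are made up for this example, and it ignores everything a real implementation needs beyond DATA frames (HEADERS, END_STREAM flags, padding, error handling).

```python
import struct
from collections import defaultdict

FRAME_HEADER_LEN = 9  # every HTTP/2 frame starts with a 9-byte header
DATA = 0x0            # frame type code for DATA frames


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialise one frame: 24-bit length, type, flags, 31-bit stream id, payload."""
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def unpack_frames(buffer: bytes):
    """Yield (type, flags, stream_id, payload) for each frame in the buffer."""
    offset = 0
    while offset < len(buffer):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", buffer, offset)
        length = (hi << 16) | lo
        start = offset + FRAME_HEADER_LEN
        yield frame_type, flags, stream_id & 0x7FFFFFFF, buffer[start:start + length]
        offset = start + length


# Two responses (streams 1 and 3) interleaved on one connection.
wire = b"".join([
    pack_frame(DATA, 0, 1, b"<html>"),
    pack_frame(DATA, 0, 3, b"body{"),
    pack_frame(DATA, 0, 1, b"</html>"),
    pack_frame(DATA, 0, 3, b"}"),
])

# The receiver reassembles each stream from its own frames.
streams = defaultdict(bytes)
for _type, _flags, stream_id, payload in unpack_frames(wire):
    streams[stream_id] += payload
```

Because each frame carries its stream identifier, the receiver can rebuild both messages regardless of how the frames were interleaved on the wire.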
# Stream Prioritization
- A stream is an independent, bidirectional sequence of frames exchanged between the client and server

Once an HTTP message can be split into many individual frames, and we allow for frames from multiple streams to be multiplexed, the order in which the frames are interleaved and delivered both by the client and server becomes a critical performance consideration. To facilitate this, the HTTP/2 standard allows each stream to have an associated weight and dependency:
- Each stream may be assigned an integer weight between 1 and 256
- Each stream may be given an explicit dependency on another stream

Note: The combination of stream dependencies and weights allows the client to construct and communicate a “prioritization tree” that expresses how it would prefer to receive responses.
This combination provides an expressive language for resource prioritization, which is a critical feature for improving browsing performance where we have many resource types with different dependencies and weights.
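To make the role of weights concrete, here is a minimal sketch of how streams that share the same parent could split the available resources in proportion to their weights. The function name `sibling_shares` and the stream labels are invented for this example. It only covers siblings and does not model the dependency tree itself.

```python
from fractions import Fraction


def sibling_shares(weights: dict[str, int]) -> dict[str, Fraction]:
    """Split resources between sibling streams in proportion to their weights."""
    for name, weight in weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"weight of {name} must be between 1 and 256")
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


# Stream A (weight 12) and stream B (weight 4) depend on the same parent.
shares = sibling_shares({"A": 12, "B": 4})
```

With these weights, stream A would ideally receive three quarters of the resources and stream B one quarter.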
# Header compression
HPACK reduces the length of header field encodings by exploiting the redundancy inherent in protocols like HTTP.
Note: The ultimate goal of this is to reduce the amount of data that is required to send HTTP requests or responses.
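The idea of exploiting redundancy can be sketched with a toy indexing table: a header that has been sent before is replaced by a small index on later requests. This is deliberately not the real HPACK wire format (no static table, no Huffman coding, no eviction). The class name `ToyHeaderTable` and the sample headers are assumptions made for this example.

```python
class ToyHeaderTable:
    """Toy header indexing. NOT wire-compatible with HPACK, illustration only."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str]] = []

    def encode(self, headers: list[tuple[str, str]]) -> list[tuple]:
        encoded = []
        for pair in headers:
            if pair in self.entries:
                # Seen before on this connection: send only its index.
                encoded.append(("index", self.entries.index(pair)))
            else:
                # First occurrence: send the literal and remember it.
                self.entries.append(pair)
                encoded.append(("literal", pair))
        return encoded


table = ToyHeaderTable()
first = table.encode([
    (":method", "GET"), (":path", "/"),
    ("user-agent", "demo/1.0"), ("cookie", "id=abc"),
])
second = table.encode([
    (":method", "GET"), (":path", "/style.css"),
    ("user-agent", "demo/1.0"), ("cookie", "id=abc"),
])
```

On the second request only the new `:path` value travels as a literal; the repeated method, user agent and cookie collapse to index references.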
# Why not GZIP?
GZIP compression of HTTP headers is vulnerable to CRIME attacks. In theory, CRIME can be used to obtain a secret cookie.
See this awesome proof of concept by quokkalight.
Information can leak when data is compressed prior to encryption: if someone can repeatedly inject arbitrary content alongside some sensitive and relatively predictable data, and observe the resulting encrypted stream, they will eventually be able to extract the unknown data from it.
This is possible even over TLS, because while TLS provides confidentiality protection for content, it only provides a limited amount of protection for the length of that content.
With HPACK, for the attacker to find the value of a header, they must guess the entire value, instead of a gradual approach that was possible with DEFLATE matching, and was vulnerable to CRIME.
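The length leak described above can be reproduced in a few lines with `zlib`, which implements DEFLATE. The secret header and the two guesses are invented for this example. To make the difference obvious, the sketch compares a fully correct guess with a fully wrong one; a real CRIME attack refines the guess gradually and has to observe the length through the encrypted stream.

```python
import zlib

# Hypothetical secret header that the attacker cannot read directly.
SECRET_HEADER = b"cookie: session=7f3a9c2e1b\r\n"


def compressed_length(attacker_controlled: bytes) -> int:
    """Length of secret + injected data after DEFLATE. TLS hides content, not length."""
    return len(zlib.compress(SECRET_HEADER + attacker_controlled, 9))


right = compressed_length(b"cookie: session=7f3a9c2e1b")  # matches the secret
wrong = compressed_length(b"cookie: session=XQZWKJHGPM")  # does not match
```

The matching guess compresses into a single back-reference, so `right` comes out smaller than `wrong`, and that size difference is what the attacker measures.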
# Server Push
The server can push content before the client even asks for it:
- The client requests index.html
- index.html contains links to style.css and script.js
- The server might push them to the client; this is done with a PUSH_PROMISE frame

The client can opt out of push: SETTINGS_ENABLE_PUSH can be set to 0 to turn off server push.
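As a sketch of what opting out looks like on the wire, the snippet below builds a SETTINGS frame that sets SETTINGS_ENABLE_PUSH to 0. Each setting is a 16-bit identifier followed by a 32-bit value, and SETTINGS frames always travel on stream 0. The helper names are made up for this example, and it leaves out the acknowledgement exchange.

```python
import struct

SETTINGS_FRAME = 0x4        # frame type code for SETTINGS
SETTINGS_ENABLE_PUSH = 0x2  # setting identifier


def settings_payload(settings: dict[int, int]) -> bytes:
    """Encode each setting as a 16-bit identifier plus a 32-bit value."""
    return b"".join(struct.pack(">HI", ident, value)
                    for ident, value in settings.items())


def settings_frame(settings: dict[int, int]) -> bytes:
    """Wrap the payload in a 9-byte frame header on stream 0."""
    payload = settings_payload(settings)
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         SETTINGS_FRAME, 0, 0)
    return header + payload


frame = settings_frame({SETTINGS_ENABLE_PUSH: 0})
```

The resulting frame is 15 bytes long: the 9-byte header followed by one 6-byte setting.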
# Push cache
Unfortunately, even with perfect browser support you’ll have wasted bandwidth and server I/O before you get the cancel message. Cache digests aim to solve this by letting the client tell the server in advance what it has cached. Servers can then use this to inform their choices of what to push to clients.
# Flow control
The receiver tells the sender the maximum amount of data it is allowed to transmit, both per stream and for the whole connection.
This uses the WINDOW_UPDATE frame.
An interesting error code: ENHANCE_YOUR_CALM.
Sent when the peer might be generating excessive load.
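The flow-control accounting described above can be sketched as a simple counter on the sender side: sending data consumes the window, and a WINDOW_UPDATE from the receiver replenishes it. The class `SendWindow` is a made-up name, and the sketch tracks a single window, whereas a real endpoint keeps one per stream plus one for the connection. The default initial window of 65,535 bytes comes from the HTTP/2 specification.

```python
DEFAULT_WINDOW = 65_535      # initial flow-control window in bytes
MAX_INCREMENT = 2**31 - 1    # largest increment a WINDOW_UPDATE may carry


class SendWindow:
    """Sender-side view of one flow-control window (simplified)."""

    def __init__(self, size: int = DEFAULT_WINDOW) -> None:
        self.available = size

    def send(self, data: bytes) -> bytes:
        """Return the part of data that may be sent now and consume the window."""
        allowed = data[:max(self.available, 0)]
        self.available -= len(allowed)
        return allowed

    def window_update(self, increment: int) -> None:
        """Apply a WINDOW_UPDATE received from the peer."""
        if not 1 <= increment <= MAX_INCREMENT:
            raise ValueError("invalid window increment")
        self.available += increment


window = SendWindow()
first = window.send(b"x" * 100_000)   # only the first 65,535 bytes fit
stalled = window.send(b"more")        # window exhausted, nothing may be sent
window.window_update(1_000)           # receiver grants 1,000 more bytes
resumed = window.send(b"y" * 4_000)   # exactly 1,000 bytes go out
```

A receiver that never sends WINDOW_UPDATE therefore stalls the sender, which is the lever the slow read attack pulls in reverse.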
- encryption is not required by the protocol itself, but…
- it is mandated by most implementations
- currently no browser supports unencrypted HTTP/2.
# Upgrade from h1
    GET / HTTP/1.1
    Host: server.example.com
    Connection: Upgrade, HTTP2-Settings
    Upgrade: h2c
    HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>
A client that makes a request for an “http” URI without prior knowledge about support for HTTP/2 on the next hop uses the HTTP Upgrade mechanism […] by making an HTTP/1.1 request that includes an Upgrade header field with the “h2c” token. Such an HTTP/1.1 request MUST include exactly one HTTP2-Settings (Section 3.2.1) header field.
Settings that can be carried in this header include SETTINGS_MAX_CONCURRENT_STREAMS, SETTINGS_ENABLE_PUSH and SETTINGS_MAX_FRAME_SIZE.
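The HTTP2-Settings value is the SETTINGS payload encoded as base64url with the trailing `=` padding removed. The sketch below builds that value and the upgrade request around it. The function names are invented for this example, and the chosen setting (100 concurrent streams) is only a sample value.

```python
import base64
import struct

SETTINGS_ENABLE_PUSH = 0x2
SETTINGS_MAX_CONCURRENT_STREAMS = 0x3
SETTINGS_MAX_FRAME_SIZE = 0x5


def http2_settings_header(settings: dict[int, int]) -> str:
    """base64url-encode the SETTINGS payload and drop the trailing padding."""
    payload = b"".join(struct.pack(">HI", ident, value)
                       for ident, value in settings.items())
    return base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")


def upgrade_request(host: str, settings: dict[int, int]) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to h2c."""
    lines = [
        "GET / HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {http2_settings_header(settings)}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


token = http2_settings_header({SETTINGS_MAX_CONCURRENT_STREAMS: 100})
request = upgrade_request("server.example.com",
                          {SETTINGS_MAX_CONCURRENT_STREAMS: 100})
```

For this single setting the header value comes out as `AAMAAABk`.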
# Optimising for h2
- good practice, but with restrictions
- Image sprites: still relevant (https://blog.octo.com/en/http2-arrives-but-sprite-sets-aint-no-dead/)
- Domain sharding: mostly outdated (https://bunnycdn.com/blog/domain-sharding-might-actually-be-hurting-your-performance/)
However, concatenation still has its place because of compression ratios. Generally, larger files yield better compression results, reducing the total size of your page. Although HTTP/2 requests are cheap, you may see improved performance by concatenating modules into logical groups.
# H2 for APIs
- Server push: not useful in a REST API context
- Because the HTTP/2 wire format is more efficient (in particular due to multiplexing and compression), REST APIs on top of HTTP/2 will also benefit from it.
- H2 can use one connection for parallelism, without head of line blocking.
The downside of HTTP/2’s network friendliness is that it makes TCP congestion control more noticeable; now that browsers only use one connection per host, the initial window and packet losses are a lot more apparent. (https://dzone.com/articles/benefits-of-rest-apis-with-http2)
# In Action
    curl -I -v https://en.wikipedia.org/wiki/Peppa_Pig

    * Connected to en.wikipedia.org (184.108.40.206) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    ...
    * SSL connection using TLSv1.2 / ECDHE-ECDSA-CHACHA20-POLY1305
    * ALPN, server accepted to use h2
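The same ALPN negotiation that curl reports can be observed from Python's standard `ssl` module. This is a minimal sketch: the function names are made up, and `negotiated_protocol` needs network access, so the result depends on what the server offers at the time.

```python
import socket
import ssl


def alpn_context() -> ssl.SSLContext:
    """TLS context that offers h2 first and falls back to HTTP/1.1."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host and return the ALPN protocol the server selected."""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with alpn_context().wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

Calling `negotiated_protocol("en.wikipedia.org")` should return `"h2"` for a server that accepts HTTP/2, or `"http/1.1"` (or `None`) otherwise.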
- Slow read
- HPACK bomb
- Dependency cycle attack
- Stream multiplexing abuse
# Slow read
- malicious client reads responses very slowly
- server will allocate resources to the stream (usually one thread per stream)
- As long as the attacker sends WINDOW_UPDATE frames, the thread is kept alive
While in the original setting the attacker had to open as many TCP connections as the victim server has worker threads, in the HTTP/2 setting the attack becomes simpler, since the attacker can use the stream multiplexing capabilities to multiplex a large number of streams over a single TCP connection. Although the server maintains a single TCP connection, it dedicates a thread per stream, and thus the result of the attack is consumption of all the worker threads of the victim server.
# HPACK bomb
- an attack in which the attacker generates a first stream with one large header, as big as the entire table.
- Then they repeatedly open new streams on the same connection that reference this single large header as many times as possible.
- The server keeps allocating new memory to decompress the requests and eventually consumes all memory, denying further access by other clients
Note: The default size of the dynamic table is 4KB. The server allows one request to contain up to 16K header references. By sending a single header of size 4KB and then sending a request with 16K references to this one header, the request is decompressed to 64MB on the server side. As we open more streams on the same connection, we quickly consume more and more memory, as shown in Figures 24, 25, and 26. In our lab, 14 streams, which consumed 896MB after decompression, were enough to crash the server.
The mitigation is to limit the accepted header size with SETTINGS_MAX_HEADER_LIST_SIZE.
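The amplification figures quoted in the note above follow from simple arithmetic, worked through here. The variable names are only for illustration; the numbers are the ones given in the note.

```python
KIB = 1024
MIB = 1024 * KIB

header_size = 4 * KIB               # one header as large as the dynamic table
references_per_request = 16 * KIB   # 16K references to that single header

# Each reference expands to the full header on the server side.
decompressed_per_stream = header_size * references_per_request

streams_to_crash = 14               # number of streams reported in the note
total_decompressed = decompressed_per_stream * streams_to_crash

per_stream_mib = decompressed_per_stream // MIB
total_mib = total_decompressed // MIB
```

This gives 64MB per stream and 896MB for 14 streams, matching the quoted numbers.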
# The future
- QUIC: https://en.wikipedia.org/wiki/QUIC
Note: QUIC stands for Quick UDP Internet Connections. It is an experimental protocol over UDP.
- HTTP/2 RFC
- HPACK RFC
- HTTP/2 explained by Daniel Stenberg
- Google Developers post
- High Performance Browser Networking by Ilya Grigorik (O’Reilly)
- HTTP/2 Demo and
- its explanation by Javier Garza @ Akamai
- Some pros and cons summarised by Upwork
- HTTP/2 usage trend
- h2 stats on keycdn
- A great vulnerability report by Imperva
- Great article about server push and push cache: jakearchibald.com/2017/h2-push-tougher-than-i-thought/
- How to optimise for HTTP/2
- Discussion around optimising for h2 on Hacker News
- The average page size is a myth