Notes on HTTP/2

I recently had the opportunity to do a small presentation about HTTP/2. While preparing for this, I gathered lots of resources and dived into them to understand the protocol.

I am sharing the notes that formed the basis of the presentation, along with all the resources I used, in the hope that they are useful to some.

# Background

  • HyperText Transfer Protocol, an application layer protocol
  • presumes an underlying and reliable transport layer protocol; Transmission Control Protocol (TCP) is commonly used.
  • Text-based (ASCII)

# Why?

Why is there a need for a new protocol? It’s partly because the way we use the web has changed: interactivity, more media, more content.

# A bloated web

# Issues with HTTP 1.1

  • Limited pipelining

Pipelining of requests results in a significant improvement, especially over high latency connections. It is less apparent on broadband connections, as the limitation of HTTP 1.1 still applies: the server must send its responses in the same order that the requests were received—so the entire connection remains first-in-first-out and HOL blocking can occur.

  • Head-of-line blocking
  • Bulky headers (User-Agent, Cookies) - cookies might be particularly long and are sent with every request
  • these drawbacks encouraged developers to use multiple TCP connections
  • even though HTTP 1.1 advised against this (source)
  • … and the slow-start mechanism of TCP makes this somewhat suboptimal
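
The head-of-line blocking described above can be illustrated with a small simulation. This is only a sketch: the function names and the timings are made up for illustration, and network transfer time is ignored.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """HTTP/1.1 pipelining: responses leave in request order,
    so each one waits for every response queued before it."""
    return list(accumulate(ready_times, max))


def multiplexed_delivery(ready_times):
    """Idealised multiplexing: each response leaves as soon as it is ready."""
    return list(ready_times)


# Hypothetical times (in ms) at which the server has each response ready.
ready = [900, 50, 60]
print(pipelined_delivery(ready))    # [900, 900, 900]
print(multiplexed_delivery(ready))  # [900, 50, 60]
```

The two fast responses are held back by the slow first one in the pipelined case, which is the first-in-first-out behaviour mentioned above.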

# h2 timeline

  • derived from SPDY (@ Google)
  • Internet Engineering Task Force presented it for Proposed Standard in 2014
  • Internet Engineering Steering Group approved it in early 2015
  • spec was published in May 2015

# Backwards compatibility

  • HTTP semantics (methods, status codes, headers) unaffected
  • applications can be unaware of the protocol they use
  • … unless they implement a web server
  • … or a custom client
  • … or want to optimise their site to the fullest
  • … more on optimisation later

# Top new features

  • binary framing layer
  • header compression
  • stream prioritisation
  • server push
  • flow control (similar to SSH)
  • (encryption)

# Binary framing layer

  • while h1 is text-based, h2 is binary
  • no need for multiple TCP connections per origin


The new binary framing layer enables full request and response multiplexing, by allowing the client and server to break down an HTTP message into independent frames, interleave them, and then reassemble them on the other end. HTTP/2 no longer needs multiple TCP connections to multiplex streams in parallel; each stream is split into many frames, which can be interleaved and prioritized. As a result, all HTTP/2 connections are persistent, and only one connection per origin is required, which offers numerous performance benefits.
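
To make the framing layer a bit more tangible, here is a minimal sketch of the fixed 9-byte frame header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The helper names are my own, not from any library.

```python
import struct


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 1 << 24:
        raise ValueError("payload length must fit in 24 bits")
    # 3 bytes of length, then type, flags and the stream id (reserved bit cleared).
    return length.to_bytes(3, "big") + struct.pack(
        ">BBI", frame_type, flags, stream_id & 0x7FFFFFFF
    )


def parse_frame_header(header):
    """Split a 9-byte header back into (length, type, flags, stream_id)."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


# A DATA frame (type 0x0) with END_STREAM (flag 0x1) on stream 3, 5-byte payload.
header = pack_frame_header(5, 0x0, 0x1, 3)
print(parse_frame_header(header))  # (5, 0, 1, 3)
```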

# Multiplexing

Multiplexing of requests is achieved by having each HTTP request/response exchange associated with its own stream.

Note: Streams are largely independent of each other, so a blocked or stalled request or response does not prevent progress on other streams.

In fact, it introduces a ripple effect of numerous performance benefits across the entire stack of all web technologies, enabling us to:

  • Interleave multiple requests in parallel without blocking on any one
  • Interleave multiple responses in parallel without blocking on any one
  • Use a single connection to deliver multiple requests and responses in parallel
  • Remove unnecessary HTTP/1.x workarounds (see Optimizing for HTTP/1.x), such as concatenated files, image sprites, and domain sharding
  • Deliver lower page load times by eliminating unnecessary latency and improving utilization of available network capacity
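
As a rough illustration of multiplexing, the sketch below splits two messages into frames tagged with a stream identifier, interleaves them on one "connection", and reassembles them on the other side. The round-robin order and the tiny frame size are assumptions chosen for readability; a real implementation schedules frames according to priorities and flow control.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Cut a message into (stream_id, chunk) frames."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    return [
        frame
        for group in zip_longest(*frame_lists)
        for frame in group
        if frame is not None
    ]


def reassemble(frames):
    """Receiver side: group chunks by stream id, preserving per-stream order."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk
    return dict(streams)


page = b"<html>hello</html>"
css = b"body{}"
wire = interleave(split_into_frames(1, page, 4), split_into_frames(3, css, 4))
print(reassemble(wire) == {1: page, 3: css})  # True
```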

# Stream Prioritization

  • A stream is an independent, bidirectional sequence of frames exchanged between the client and server
  • Each stream may be assigned an integer weight between 1 and 256
  • Each stream may be given an explicit dependency on another stream

Note: The combination of stream dependencies and weights allows the client to construct and communicate a “prioritization tree” that expresses how it would prefer to receive responses.

The combination of stream dependencies and weights provides an expressive language for resource prioritization, which is a critical feature for improving browsing performance where we have many resource types with different dependencies and weights.

Once an HTTP message can be split into many individual frames, and we allow for frames from multiple streams to be multiplexed, the order in which the frames are interleaved and delivered both by the client and server becomes a critical performance consideration. To facilitate this, the HTTP/2 standard allows each stream to have an associated weight and dependency.
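
Streams that depend on the same parent should share resources in proportion to their weights. A minimal sketch of that arithmetic, with a hypothetical helper name and made-up weights:

```python
def bandwidth_shares(sibling_weights):
    """Map each sibling stream id to its proportional share of resources."""
    for stream_id, weight in sibling_weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"stream {stream_id}: weight must be in 1..256")
    total = sum(sibling_weights.values())
    return {sid: weight / total for sid, weight in sibling_weights.items()}


# Stream 1 (weight 12) and stream 3 (weight 4) depend on the same parent.
print(bandwidth_shares({1: 12, 3: 4}))  # {1: 0.75, 3: 0.25}
```

Note that this is a preference expressed by the client, not a guarantee: the server is free to schedule differently.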

# Header compression


  • reduces the length of header field encodings by exploiting the redundancy inherent in protocols like HTTP.

Note: The ultimate goal of this is to reduce the amount of data that is required to send HTTP requests or responses.
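
The idea of exploiting redundancy can be sketched with a toy table that both sides build up identically: a header seen before is replaced by a small index. This is a deliberately simplified illustration, not the real HPACK wire format (which also has a static table, Huffman coding, and size-based eviction), and the class name is my own.

```python
class ToyHeaderTable:
    """Toy illustration of index-based header compression."""

    def __init__(self):
        self.entries = []  # state shared by encoder and decoder

    def encode(self, headers):
        encoded = []
        for field in headers:
            if field in self.entries:
                # Seen before on this connection: send only its index.
                encoded.append(("index", self.entries.index(field)))
            else:
                # First occurrence: send it literally and remember it.
                encoded.append(("literal", field))
                self.entries.append(field)
        return encoded


table = ToyHeaderTable()
first = table.encode([(":path", "/"), ("user-agent", "demo")])
second = table.encode([(":path", "/style.css"), ("user-agent", "demo")])
print(second)  # [('literal', (':path', '/style.css')), ('index', 1)]
```

In the second request the repeated `user-agent` header shrinks to a single index, which is where the savings for long cookies and user agent strings come from.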

# Why not GZIP?

GZIP compression of HTTP headers is vulnerable to CRIME attacks. In theory, CRIME can be used to obtain a secret cookie.

See this awesome proof of concept by quokkalight.

Information can leak when data is compressed prior to encryption: if someone can repeatedly inject arbitrary content alongside some sensitive and relatively predictable data, and observe the resulting encrypted stream, they will eventually be able to extract the unknown data from it.

If they can observe network traffic, and manipulate the victim’s browser to submit requests to the target site, they can, as a result, steal the site’s cookies, and thus hijack the victim’s session. In the current form, the exploit uses JavaScript and needs 6 requests to extract one byte of data.


This is possible even over TLS, because while TLS provides confidentiality protection for content, it only provides a limited amount of protection for the length of that content.
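
The length leak can be demonstrated with plain DEFLATE. In this sketch the cookie value, the request layout, and the function name are all made up; it only shows that a correct guess compresses better than a wrong one, not a full attack.

```python
import zlib

# Hypothetical secret header that the victim's browser adds to every request.
SECRET_HEADER = b"Cookie: sessionid=7f3a9c2e1b\r\n"


def observed_length(injected):
    """Length an eavesdropper sees: encryption hides content, not size."""
    request = b"GET /" + injected + b" HTTP/1.1\r\n" + SECRET_HEADER
    return len(zlib.compress(request))


right = observed_length(b"Cookie: sessionid=7f3a9c2e1b")
wrong = observed_length(b"Cookie: sessionid=q8w5x0z4k6")
print(right < wrong)  # True: the matching guess is replaced by a back-reference
```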

With HPACK, an attacker who wants to find the value of a header must guess the entire value at once, rather than recovering it gradually, as was possible with DEFLATE matching, which made it vulnerable to CRIME.

# Server Push

The server can push content before the client has even asked for it.

Note: The client requests index.html. index.html contains links to style.css and script.js, so the server might push them to the client. This is done with a PUSH_PROMISE frame. The client can opt out of the push.

SETTINGS_ENABLE_PUSH can be set to 0 to turn off server push.
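
On the wire, this is a SETTINGS frame (type 0x4) on stream 0 whose payload is a list of 16-bit identifier and 32-bit value pairs; SETTINGS_ENABLE_PUSH has identifier 0x2. A minimal sketch, with a helper name of my own:

```python
import struct

FRAME_TYPE_SETTINGS = 0x4
SETTINGS_ENABLE_PUSH = 0x2


def settings_frame(settings):
    """Serialise a SETTINGS frame: 9-byte header plus 6 bytes per setting."""
    payload = b"".join(
        struct.pack(">HI", identifier, value)
        for identifier, value in settings.items()
    )
    header = (
        len(payload).to_bytes(3, "big")      # payload length
        + bytes([FRAME_TYPE_SETTINGS, 0x0])  # type, flags
        + (0).to_bytes(4, "big")             # SETTINGS always uses stream 0
    )
    return header + payload


frame = settings_frame({SETTINGS_ENABLE_PUSH: 0})
print(len(frame), frame.hex())
```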

# Push cache

Unfortunately, even with perfect browser support you’ll have wasted bandwidth and server I/O before you get the cancel message. Cache digests aim to solve this by letting the client tell the server in advance what it already has cached. Servers can then use this to inform their choices of what to push to clients.


# Flow control

The receiver tells the sender the maximum amount of data it is allowed to transmit (over a single stream or over the whole connection). This uses the WINDOW_UPDATE frame.
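
A minimal sketch of the sender-side bookkeeping, assuming the default initial window of 65,535 bytes from the spec. The class and method names are made up for illustration.

```python
class SendWindow:
    """Tracks how many bytes the peer currently allows us to send."""

    DEFAULT = 65_535  # initial flow-control window defined by HTTP/2

    def __init__(self, size=DEFAULT):
        self.available = size

    def send(self, data):
        """Send as much as the window permits; return the unsent remainder."""
        allowed = min(len(data), self.available)
        self.available -= allowed
        return data[allowed:]

    def window_update(self, increment):
        """Apply a WINDOW_UPDATE frame received from the peer."""
        if not 0 < increment < 2 ** 31:
            raise ValueError("increment must be between 1 and 2^31 - 1")
        self.available += increment


window = SendWindow(size=10)         # tiny window to keep the example readable
rest = window.send(b"x" * 25)        # only 10 bytes go out
print(len(rest), window.available)   # 15 0
window.window_update(10)             # receiver grants 10 more bytes
rest = window.send(rest)
print(len(rest), window.available)   # 5 0
```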

An interesting error code:


Sent when the peer might be generating excessive load.

# Encryption

  • it is not required by the protocol itself, but…
  • encryption is mandated by most implementations
  • currently no browser supports unencrypted HTTP/2.

# Upgrade from h1

GET / HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>


A client that makes a request for an “http” URI without prior knowledge about support for HTTP/2 on the next hop uses the HTTP Upgrade mechanism […] by making an HTTP/1.1 request that includes an Upgrade header field with the “h2c” token. Such an HTTP/1.1 request MUST include exactly one HTTP2-Settings (Section 3.2.1) header field.

Settings that can be carried this way include SETTINGS_MAX_CONCURRENT_STREAMS, SETTINGS_ENABLE_PUSH, and SETTINGS_MAX_FRAME_SIZE.
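
The value of the HTTP2-Settings header is the SETTINGS payload (16-bit identifier plus 32-bit value per setting), base64url-encoded with trailing `=` padding omitted. A minimal sketch of producing it, where the function name and the chosen values are only examples:

```python
import base64
import struct

SETTINGS_ENABLE_PUSH = 0x2
SETTINGS_MAX_CONCURRENT_STREAMS = 0x3


def http2_settings_header(settings):
    """Encode settings as the value of the HTTP2-Settings header."""
    payload = b"".join(
        struct.pack(">HI", identifier, value)
        for identifier, value in settings.items()
    )
    # base64url, with any trailing "=" padding stripped
    return base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")


value = http2_settings_header({
    SETTINGS_MAX_CONCURRENT_STREAMS: 100,
    SETTINGS_ENABLE_PUSH: 0,
})
print("HTTP2-Settings:", value)
```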

# Optimising for h2

  • Concatenation
    • good practice, but with restrictions
  • Spriting
    • still relevant
  • Domain sharding
    • mostly outdated

Note: Since it was quicker to download a single file than several smaller files, it was best practice to concatenate your site’s CSS into one large file, and your JavaScript into another. In h2, requests are cheaper, so a large concatenated file is mostly not required and can even be an anti-pattern: the concatenated file often contains components not required by the current page. For example, your blog page might load components that are only used on your checkout pages. And if a single component changes, the entire concatenated file has to be invalidated in the browser cache.

However, concatenation still has its place because of compression ratios. Generally, larger files yield better compression results, thus reducing the total size of your page. Although HTTP/2 requests are cheap, you may see improved performance by concatenating modules into logical groups.
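
To make the compression argument concrete, here is a small experiment with zlib. The "modules" are synthetic and deliberately similar to each other, so the exact sizes say nothing about real sites; the point is only that one compressed bundle can come out smaller than the same files compressed separately.

```python
import zlib

# Five made-up, similar-looking JavaScript modules.
modules = [
    (f"export function handler{i}(event) "
     f"{{ return dispatch('{i}', event); }}\n").encode() * 20
    for i in range(5)
]

separate = sum(len(zlib.compress(module)) for module in modules)
bundled = len(zlib.compress(b"".join(modules)))

print("compressed separately:", separate, "bytes")
print("compressed as a bundle:", bundled, "bytes")
print(bundled < separate)  # True
```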

# H2 for APIs

  • Server push: not useful in a REST API context
  • Because the HTTP/2 wire format is more efficient (in particular due to multiplexing and compression), REST APIs on top of HTTP/2 will also benefit from it.
  • H2 can use one connection for parallelism, without head of line blocking.


The downside of HTTP/2’s network friendliness is that it makes TCP congestion control more noticeable; now that browsers only use one connection per host, the initial window and packet losses are a lot more apparent.

# In Action

curl -I -v

* Connected to ( port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1


* SSL connection using TLSv1.2 / ECDHE-ECDSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
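
The same ALPN negotiation can be reproduced with Python's standard `ssl` module. This is a sketch: the function names are mine, and the host is whatever server you want to probe.

```python
import socket
import ssl


def make_alpn_context():
    """TLS context that offers h2 first and falls back to http/1.1."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host, port=443):
    """Connect to host and return the protocol the server picked via ALPN."""
    context = make_alpn_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            # "h2" if the server accepted HTTP/2, None if it ignored ALPN
            return tls.selected_alpn_protocol()
```

Calling `negotiated_protocol(...)` with a hostname that speaks HTTP/2 should return `"h2"`, mirroring the "server accepted to use h2" line in the curl output.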

# Vulnerabilities

  • Slow read
  • HPACK bomb
  • Dependency cycle attack
  • Stream multiplexing abuse

# Slow read

  • malicious client reads responses very slowly
  • server will allocate resources to the stream (usually one thread per stream)
    • As long as the attacker sends WINDOW_UPDATE frames, the thread is kept alive


While in the original setting the attacker had to open as many TCP connections as the victim server could handle, in the HTTP/2 setting the attack becomes simpler, since the attacker can use the stream multiplexing capabilities to multiplex a large number of streams over a single TCP connection. Although the server maintains a single TCP connection, it dedicates a thread per stream, and thus the result of the attack is consumption of all the worker threads of the victim server.

# HPACK bomb

  • an attack in which the attacker generates a first stream with one large header, as big as the entire table.
  • Then they repeatedly open new streams on the same connection that reference this single large header as many times as possible.
  • The server keeps allocating new memory to decompress the requests and eventually consumes all memory, denying further access by other clients

Note: The default size of the dynamic table is 4KB. The server allows one request to contain up to 16K of header references. By sending a single header of size 4KB and then sending a request with 16K references to this one header, the request is decompressed to 64MB on the server side. As we open more streams on the same connection, we quickly consume more and more memory as shown in Figures 24, 25, and 26. In our lab, 14 streams, which consumed 896MB after decompression, were enough to crash the server.
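
The numbers in the note can be checked with a few lines of arithmetic, reading "4KB" and "16K" as 4096 bytes and 16384 references (my assumption):

```python
KIB = 1024
MIB = 1024 * KIB

header_size = 4 * KIB               # one header filling the dynamic table
references_per_request = 16 * KIB   # 16K references to that header
streams = 14

per_stream = header_size * references_per_request
total = per_stream * streams

print(per_stream // MIB, "MB per stream")  # 64 MB per stream
print(total // MIB, "MB in total")         # 896 MB in total
```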


# The future

  • QUIC:

Note: QUIC stands for Quick UDP Internet Connections. It is an experimental protocol over UDP.

# Sources

Written on March 3, 2019

If you notice anything wrong with this post (factual error, rude tone, bad grammar, typo, etc.), and you feel like giving feedback, please do so by contacting me. Thank you!