Understanding Hyper HTTP Clients (Part 1)
One of the under appreciated joys of open source is the ability to study someone else’s system design, to learn directly from working code. Today I wanted to try a new type of article, a walkthrough of interesting Rust codebases, starting with Hyper, a low level HTTP library.
Let’s build our understanding of Hyper from the bottom up, walking through some of the ’leaf’ modules and see how they fit into the call flow of a typical HTTP1 request response.
Overview
The journey a head!
------------ Public API ------------------
┌────────────┐┌──────────────┐
│Client::Conn││Client::Sender│
└┬───────────┘└──────────────┘
-┬--------- Private API ------------------
|
┌▽─────────────┐
│Dispatcher │
└┬────────────┬┘
┌▽──────────┐┌▽───────┐
│Conn ││Dispatch│
└┬─────────┬┘└────────┘
┌▽───────┐┌▽───┐
│Buffered││Role│ // Starting HERE!
└────────┘└────┘
Roles
Hyper defines x2 roles: Client
and Server
, which is an idea defined by the RFC as a specific type of participant within the HTTP dialogue.
#[cfg(feature = "client")]
pub(crate) enum Client {}
#[cfg(feature = "client")]
impl Http1Transaction for Client {
//....
}
#[cfg(feature = "server")]
pub(crate) enum Server {}
#[cfg(feature = "server")]
impl Http1Transaction for Server{
//....
}
More practically we can see the module defines two entities which implement the Http1Transaction
trait. The trait is a series of ‘static’ methods i.e. no self
receiver which specialise decision making depending on the participants role, with the trait’s default method implementation illustrating this quite nicely.
pub(crate) trait Http1Transaction {
// ...
fn should_error_on_parse_eof() -> bool {
Self::is_client()
}
fn should_read_first() -> bool {
Self::is_server()
}
}
The roles are used to fork behaviour in HTTP related functions based on participants. Remember there’s no inheritance in rust, so this is one way code can implement specialising behaviours.
// called by connections handling code for both client and server
pub(super) fn parse_headers<T>(
bytes: &mut BytesMut,
prev_len: Option<usize>,
ctx: ParseContext<'_>,
) -> ParseResult<T::Incoming>
where
T: Http1Transaction,
{
//....
T::parse(bytes, ctx)
}
Buffered
An internal struct which wraps a generic IO object and provides a polling interface for reading and writing HTTP messages with the help of some internal buffers.
The parse method is the equivalent of ‘socket.read()’ + parsing. It aims to accumulate enough bytes into the read buffer so it can parse the HTTP message, returning Poll:Ready when it’s finally got enough bytes to decipher the message. It also branches parsing logic based on role as we previously discussed.
impl<T, B> Buffered<T, B>
where
T: Read + Write + Unpin,
B: Buf,
{
// parse() reads IO and creates http messages out of buffer content
fn parse<S>(
&mut self,
cx: &mut Context<'_>,
parse_ctx: ParseContext<'_>,
) -> Poll<crate::Result<ParsedMessage<S::Incoming>>>
where
S: Http1Transaction,
{
// ...
}
// ...
Writing to the socket is accomplished via crate public getter method for a write buffer and an accompanying flush method that moves bytes over to the generic IO object via its async_write interface and ultimately calls flush(). Similar to parse, the methods return std::Task
so it plays nice with non blocking async.
pub(super) fn write_buf(&mut self) -> &mut WriteBuf<B> {
&mut self.write_buf
}
pub(crate) fn poll_flush(&mut self, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
// ...
}
Conn(ection)
The connection struct is composed of / builds upon the Role
object ( type T) and the HTTP based buffer Buffered
and a new entity State
.
pub(crate) struct Conn<I, B, T> {
io: Buffered<I, EncodedBuf<B>>,
state: State,
_marker: PhantomData<fn(T)>,
}
The State
struct is file private and is a practical home for the possible state space of a HTTP connection. The connection uses the state struct to track the state of Buffered
and expose a series of boolean returning ‘check’ methods using the common ‘is’ or ‘can’ prefixes.
pub(crate) fn is_read_closed(&self) -> bool {
self.state.is_read_closed()
}
pub(crate) fn is_write_closed(&self) -> bool {
self.state.is_write_closed()
}
pub(crate) fn can_read_head(&self) -> bool {
// ...
!matches!(self.state.writing, Writing::Init)
}
The Connection module builds upon Buffered
IO and introduces reading and writing methods more aligned with the full life cycle of a HTTP request: reading bodies, reading with keepalive in mind etc.
pub(super) fn poll_read_head(
&mut self,
cx: &mut Context<'_>,
) -> Poll<Option<crate::Result<(MessageHead<T::Incoming>, DecodedLength, Wants)>>> {
// ...
}
pub(crate) fn poll_read_body(
&mut self,
cx: &mut Context<'_>,
) -> Poll<Option<io::Result<Frame<Bytes>>>> {
// ...
}
pub(crate) fn poll_read_keep_alive(&mut self, cx: &mut Context<'_>) -> Poll<crate::Result<()>> {
// ...
}
The Conn
API endeavours to provide the building blocks for interacting with a HTTP based connection but ultimately leaves the decisions / process to the upper layers. Also noteworthy, we’re still returning std::Task
. Here is a simplified example of what Conn
API consumer would look like:
// Client Send HTTP request example
if !conn.can_write_head() { return }
conn.write_head(head, body_type)?;
if !conn.can_write_body() { return }
conn.write_body(data)?;
conn.end_body()?
conn.poll_flush(cx)
The question for upper api layers is how to best ingest needed data and how to orchestrate the reading and writing of a connection, and so we start to see the usage of concurrency APIs ( Tokio ) and code made to service the public Hyper API.
Well after all that, we’re about half way up the stack, still to discover the main IO loop and how this all ties back to public API - all to come in part 2.
The idea of this article is kinda experimental, I have no idea if other people enjoy delving into other code bases and deciphering them, so please message me via socials if this has been of interest! And yes, enjoyment comes in many forms…