diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..a136337
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+*.pdf
diff --git a/README.markdown b/README.markdown
index 445b1d7..8dab1be 100644
--- a/README.markdown
+++ b/README.markdown
@@ -9,13 +9,13 @@ lecture and discussion. Participants will gain an intuitive understanding of
key distributed systems terms, an overview of the algorithmic landscape, and
explore production concerns.
-## What makes a thing distributed?
+## What makes a thing distributed
Lamport, 1987:
-> A distributed system is one in which the failure of a computer
-> you didn't even know existed can render your own computer
-> unusable.
+> A distributed system is one in which the failure of a computer
+> you didn't even know existed can render your own computer
+> unusable.
- First glance: \*nix boxen in our colo, running processes communicating via
TCP or UDP.
@@ -162,7 +162,7 @@ Lamport, 1987:
- This causes all kinds of havoc in, say, metrics collection
- And debugging it is *hard*
- TCP gives you flow control and repacks logical messages into packets
- - You'll need to re-build flow-control and backpressure
+ - You'll need to re-build flow-control and back-pressure
- TLS over UDP is a thing, but tough
- UDP is really useful where TCP FSM overhead is prohibitive
- Memory pressure
@@ -187,15 +187,15 @@ Lamport, 1987:
- Caveat: Hardware can drift
- Caveat: By *centuries*
- NTP might not care
- - http://rachelbythebay.com/w/2017/09/27/2153/
+ -
- Caveat: NTP can still jump the clock backwards (default: delta > 128 ms)
- - https://www.eecis.udel.edu/~mills/ntp/html/clock.html
+ -
- Caveat: POSIX time is not monotonic by *definition*
- Cloudflare 2017: Leap second at midnight UTC meant time flowed backwards
- At the time, Go didn't offer access to CLOCK_MONOTONIC
  - Computed a negative duration, then fed it to rand.Int63n(), which panicked
- Caused DNS resolutions to fail: 1% of HTTP requests affected for several hours
- - https://blog.cloudflare.com/how-and-why-the-leap-second-affected-cloudflare-dns/
+ -
- Caveat: The timescales you want to measure may not be attainable
- Caveat: Threads can sleep
- Caveat: Runtimes can sleep
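A minimal Go sketch of the Cloudflare failure mode in the bullets above (illustrative, not their actual code): subtracting wall-clock readings can yield a negative duration when the clock steps backwards, and math/rand's `Int63n` panics on any argument <= 0.

```go
// Simulates the bug class described above: a backwards clock step makes a
// "duration" negative, and rand.Int63n rejects non-positive arguments.
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	start := time.Now()
	// Pretend the wall clock stepped back one second (leap second, NTP step)
	// between two readings.
	end := start.Add(-time.Second)

	elapsed := end.Sub(start) // -1s: a negative duration
	fmt.Println("elapsed:", elapsed)

	// Panics: "invalid argument to Int63n". At the time of the incident,
	// Go exposed no monotonic clock that would have avoided this.
	rand.Int63n(int64(elapsed))
}
```

Since Go 1.9, `time.Now()` carries a monotonic reading and `time.Since(start)` cannot go negative, which is exactly the `CLOCK_MONOTONIC` access the notes say was missing.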
@@ -252,7 +252,6 @@ Lamport, 1987:
- I don't know who's doing it yet, but I'd bet datacenters in the
future will offer dedicated HW interfaces for bounded-accuracy time.
-
## Review
We've covered the fundamental primitives of distributed systems. Nodes
@@ -261,9 +260,6 @@ various ways. Protocols like TCP and UDP give us primitive channels for
processes to communicate, and we can order events using clocks. Now, we'll
discuss some high-level *properties* of distributed systems.
-
-
-
## Availability
- Availability is basically the fraction of attempted operations which succeed.
@@ -309,7 +305,6 @@ discuss some high-level *properties* of distributed systems.
- "Apdex for the user service just dropped to 0.5; page ops!"
- Ideally: integral of happiness delivered by your service?
-
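To make "fraction of operations" and Apdex concrete, a minimal Go sketch (the helper names are mine). The standard Apdex formula scores each request against a target latency T: at or under T counts 1, under 4T counts 0.5, anything slower counts 0.

```go
package metrics

import "time"

// Availability: the fraction of attempted operations which succeeded.
func Availability(succeeded, attempted int) float64 {
	if attempted == 0 {
		return 1 // vacuously up; a judgment call
	}
	return float64(succeeded) / float64(attempted)
}

// Apdex scores response latencies against a target T:
// satisfied (<= T) = 1, tolerating (<= 4T) = 0.5, frustrated = 0.
func Apdex(latencies []time.Duration, target time.Duration) float64 {
	if len(latencies) == 0 {
		return 1
	}
	var score float64
	for _, l := range latencies {
		switch {
		case l <= target:
			score++
		case l <= 4*target:
			score += 0.5
		}
	}
	return score / float64(len(latencies))
}
```

With T = 100ms, latencies of 50ms, 120ms, and 900ms score (1 + 0.5 + 0) / 3 = 0.5: the "page ops!" threshold from the example above.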
## Consistency
- A consistency model is the set of "safe" histories of events in the system
@@ -465,7 +460,6 @@ discuss some high-level *properties* of distributed systems.
- Monotonic Reads
- Monotonic Writes
-
### Harvest and Yield
- Fox & Brewer, 1999: Harvest, Yield, and Scalable Tolerant Systems
@@ -482,8 +476,8 @@ discuss some high-level *properties* of distributed systems.
- e.g. "99% of the time, you can read 90% of your prior writes"
- Strongly dependent on workload, HW, topology, etc
- Can tune harvest vs yield on a per-request basis
- - "As much as possible in 10ms, please"
- - "I need everything, and I understand you might not be able to answer"
+ - "As much as possible in 10ms, please"
+ - "I need everything, and I understand you might not be able to answer"
### Hybrid systems
@@ -505,7 +499,6 @@ consistency models generally come at the cost of performance and availability.
Next, we'll talk about different ways to build systems, from weak to strong
consistency.
-
## Avoid Consensus Wherever Possible
### CALM conjecture
@@ -534,7 +527,6 @@ consistency.
- Unordered programming with flow analysis
- Can tell you where coordination *would* be required
-
### Gossip
- Message broadcast system
@@ -552,7 +544,7 @@ consistency.
- Hop up to a connector node which relays to other connector nodes
- Reduces superfluous messages
- Reduces latency
- - Plumtree (Leit ̃ao, Pereira, & Rodrigues, 2007: Epidemic Broadcast Trees)
+  - Plumtree (Leitão, Pereira, & Rodrigues, 2007: Epidemic Broadcast Trees)
- Push-Sum et al
- Sum inputs from everyone you've received data from
- Broadcast that to a random peer
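The Push-Sum bullets above compress the protocol; here is a minimal single-process simulation of its classic form (Kempe, Dobra & Gehrke, 2003), with made-up inputs. Each node halves its (sum, weight) pair, keeps one half, and pushes the other to a random peer; every node's sum/weight ratio converges to the global mean with no coordinator.

```go
package main

import (
	"fmt"
	"math/rand"
)

type node struct{ sum, weight float64 }

func main() {
	inputs := []float64{10, 20, 30, 40} // each node's local value; mean is 25
	nodes := make([]node, len(inputs))
	for i, x := range inputs {
		nodes[i] = node{sum: x, weight: 1}
	}

	for round := 0; round < 50; round++ {
		// Buffer this round's messages so all sends happen "simultaneously".
		inbox := make([]node, len(nodes))
		for i := range nodes {
			half := node{nodes[i].sum / 2, nodes[i].weight / 2}
			nodes[i] = half               // keep one half locally
			peer := rand.Intn(len(nodes)) // push the other half to a random peer
			inbox[peer].sum += half.sum
			inbox[peer].weight += half.weight
		}
		for i := range nodes {
			nodes[i].sum += inbox[i].sum
			nodes[i].weight += inbox[i].weight
		}
	}

	// Mass conservation: the totals never change, so every node's
	// estimate converges toward 25.
	for i, n := range nodes {
		fmt.Printf("node %d estimates mean = %.4f\n", i, n.sum/n.weight)
	}
}
```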
@@ -603,7 +595,6 @@ consistency.
- Probably best in concert with stronger transactional systems
- See also: COPS, Swift, Eiger, Calvin, etc
-
## Fine, We Need Consensus, What Now?
- The consensus problem:
@@ -648,7 +639,6 @@ consistency.
majority of nodes.
- More during cluster transitions.
-
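A quick concretization of the majority arithmetic above (hypothetical helper name):

```go
package quorum

// MajorityQuorum returns the smallest set guaranteed to intersect every
// other majority in a cluster of n voters: floor(n/2) + 1.
func MajorityQuorum(n int) int { return n/2 + 1 }

// n=3 needs 2 acks and tolerates 1 fault; n=5 needs 3 and tolerates 2.
// n=4 also needs 3 acks yet still tolerates only 1 fault, which is why
// odd cluster sizes are the norm.
```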
### Paxos
- Paxos is the Gold Standard of consensus algorithms
@@ -740,8 +730,6 @@ transactions. Serializability and linearizability require *consensus*, which we
can obtain through Paxos, ZAB, VR, or Raft. Now, we'll talk about different
*scales* of distributed systems.
-
-
## Characteristic latencies
- Latency is *never* zero
@@ -791,12 +779,11 @@ can obtain through Paxos, ZAB, VR, or Raft. Now, we'll talk about different
- Network is within an order of mag compared to uncached disk seeks
- Or faster, in EC2
- EC2 disk latencies can routinely hit 20ms
- - 200ms?
- - *20,000* ms???
- - Because EBS is actually other computers
- - LMAO if you think anything in EC2 is real
- - Wait, *real disks do this too*?
- - What even are IO schedulers?
+ - 200ms? *20,000* ms???
+ - Because EBS is actually other computers
+ - LMAO if you think anything in EC2 is real
+ - Wait, *real disks do this too*?
+ - What even are IO schedulers?
- But network is waaaay slower than memory/computation
- If your aim is *throughput*, work units should probably take longer than a
millisecond
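Back-of-the-envelope, assuming a roughly 500 µs in-datacenter round trip: a caller issuing one synchronous request at a time tops out near 1 / 0.0005 s = 2,000 ops/s no matter how fast the server is, while a round trip carrying several milliseconds of batched work spends only a small fraction of its time on the network.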
@@ -849,7 +836,6 @@ latencies are short enough for many network hops before users take notice. In
geographically replicated systems, high latencies drive eventually consistent
and datacenter-pinned solutions.
-
## Common distributed systems
### Outsourced heaps
@@ -951,7 +937,6 @@ low-latency processing of datasets, and tend to look more like frameworks than
databases. Their dual, distributed queues, focus on the *messages* rather
than the *transformations*.
-
## A Pattern Language
- General recommendations for building distributed systems
@@ -1093,7 +1078,7 @@ than the *transformations*.
- Sharding for scalability
- Avoiding coordination via CRDTs
- Flake IDs: *mostly* time-ordered identifiers, zero-coordination
- - See http://yellerapp.com/posts/2015-02-09-flake-ids.html
+  - See <http://yellerapp.com/posts/2015-02-09-flake-ids.html>
- Partial availability: users can still use some parts of the system
- Processing a queue: more consumers reduces the impact of expensive events
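The Flake ID bullet above, in sketch form: a Snowflake-style 64-bit layout (the 41/10/12-bit split is Twitter Snowflake's; flake proper uses 128 bits), with hypothetical names. A millisecond timestamp in the high bits makes IDs *mostly* time-ordered, and a node ID plus a local sequence means no coordination on the ID-generation path.

```go
package flake

import (
	"sync"
	"time"
)

type Generator struct {
	mu     sync.Mutex
	nodeID int64 // unique per node (0..1023); assigning these is the only setup
	lastMs int64
	seq    int64
}

func NewGenerator(nodeID int64) *Generator { return &Generator{nodeID: nodeID} }

// NextID builds time | node | sequence without talking to any other node.
// A production version would also handle clock regression and wait out
// sequence overflow within a single millisecond.
func (g *Generator) NextID() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	now := time.Now().UnixMilli()
	if now == g.lastMs {
		g.seq = (g.seq + 1) & 0xFFF // 12-bit sequence within one millisecond
	} else {
		g.seq = 0
		g.lastMs = now
	}
	return now<<22 | g.nodeID<<12 | g.seq
}
```

Sorting IDs from different nodes approximates creation order; only "mostly", because node clocks skew.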
@@ -1290,8 +1275,6 @@ scale. As software grows, different components must scale independently,
and we break out libraries into distinct services. Service structure goes
hand-in-hand with teams.
-
-
## Production Concerns
- More than design considerations
@@ -1362,8 +1345,8 @@ hand-in-hand with teams.
- In relation to its dependencies
- Which can, in turn, drive new tests
- In a way, good monitoring is like continuous testing
- - But not a replacement: these are distinct domains
- - Both provide assurance that your changes are OK
+ - But not a replacement: these are distinct domains
+ - Both provide assurance that your changes are OK
- Want high-frequency monitoring
- Production behaviors can take place on 1ms scales
- TCP incast
@@ -1381,13 +1364,13 @@ hand-in-hand with teams.
- Key metrics for most systems
- Apdex: successful response WITHIN latency SLA
- Latency profiles: 0, 0.5, 0.95, 0.99, 1
- - Percentiles, not means
- - BTW you can't take the mean of percentiles either
+ - ^ Percentiles, not means
+ - ^ BTW you can't take the mean of percentiles either
- Overall throughput
- Queue statistics
  - Subjective experience of other systems' latency/throughput
- - The DB might think it's healthy, but clients could see it as slow
- - Combinatorial explosion--best to use this when drilling into a failure
+ - ^ The DB might think it's healthy, but clients could see it as slow
+ - ^ Combinatorial explosion--best to use this when drilling into a failure
- You probably have to write this instrumentation yourself
- Invest in a metrics library
- Out-of-the-box monitoring usually doesn't measure what really matters: your
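A minimal sketch of the percentile bullets above (names are mine): quantiles must be computed from the underlying samples, or from mergeable histograms, which is why averaging two hosts' p99 values yields a meaningless number.

```go
package metrics

import "sort"

// Percentile returns the p-th percentile (0 <= p <= 1) of samples using
// nearest-rank on a sorted copy. Production systems keep mergeable
// histograms (e.g. HDR histograms) instead of every raw sample.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	return s[int(p*float64(len(s)-1))]
}
```

For a fleet-wide p99, merge the distributions and recompute; mean(p99_a, p99_b) answers no question anyone asked.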
@@ -1504,7 +1487,6 @@ hand-in-hand with teams.
- Ask Jeff Hodges why it's hard: see his RICON West 2013 talk
- See Zach Tellman - Everything Will Flow
-
## Review
Running distributed systems requires cooperation between developers, QA, and
@@ -1519,10 +1501,15 @@ special care.
### Online
-- Mixu has a delightful book on distributed systems with incredible detail. http://book.mixu.net/distsys/
-- Jeff Hodges has some excellent, production-focused advice. https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/
-- The Fallacies of Distributed Computing is a classic text on mistaken assumptions we make designing distributed systems. http://www.rgoarchitects.com/Files/fallacies.pdf
-- Christopher Meiklejohn has a list of key papers in distributed systems. http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html
+- Mixu has a delightful book on distributed systems with incredible detail.
+  <http://book.mixu.net/distsys/>
+- Jeff Hodges has some excellent, production-focused advice.
+  <https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/>
+- The Fallacies of Distributed Computing is a classic text
+  on mistaken assumptions we make designing distributed systems.
+  <http://www.rgoarchitects.com/Files/fallacies.pdf>
+- Christopher Meiklejohn has a list of key papers in distributed systems.
+  <http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html>
### Trees
diff --git a/pandoc_pdf.sh b/pandoc_pdf.sh
new file mode 100755
index 0000000..3287602
--- /dev/null
+++ b/pandoc_pdf.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
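+# Render the notes as a two-column PDF; assumes pandoc plus a LaTeX engine.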
+pandoc --verbose --from=markdown_github --output=aphyr-distsys-intro.pdf --variable classoption=twocolumn --standalone README.markdown