# Systems and Self-Defense

_Published 2025-04-11_

Despite what the first Avengers movie would tell us, a system can
protect itself _from itself_ if built intentionally. To understand how,
let's start with the following:

**Tyler's Law**:
> "Any system will inevitably be used to 100% of its authorized capacity."

**Tyler's Corollary**:
> "If your authorized capacity is equal to your available capacity, your
> system will fail."

"Authorized" capacity and "available" capacity are different terms, and
they are not interchangeable.

## Denial of Service (DoS)

Authorized capacity for a computer is not generally controllable by the
end user (unless you've got `root` access), so a single process can
consume as many resources as the computer has available (with very few
limits). If one runs a command to duplicate a movie file of "Plan 9
from Outer Space" fifty thousand times, like

```
seq 50000 | xargs -I{} cp plan9.mov plan9copy{}.mov
```

the computer will dutifully use all its resources toward that objective
until the job completes or the disk is full.

Note that there is _no interactivity_ once executed -- the command
accepts one set of instructions, stops taking input, then executes
without any indication of progress until the job is done.

A service, on the other hand, accepts input from another source and
_persists._ As the service does work on a job, it may be coded to accept
more inputs and reply to the requestor with already-completed work.

This poses a problem for managing resources: how many resources should
be used for in-flight operations? Does the service have enough resources
to accept new work while processing a current job? How does the service
tell the requestor it's not ready yet?
If the system is using all of its _available_ resources to do work, then
there are no resources left to respond to a client/user, to process
[`signals`][9] (like `kill`/`term`), or even to provide telemetry to an
observer.

If a service or a computer "goes silent," how can we be sure it is
functioning correctly, if at all?

## The Problem

The key point of the previous section is this:

>If we are able to give a service enough work that it uses all of its
>available resources, then we've achieved a [Denial of Service][1]
>condition.

This is bad. To allow a program to "cancel" an erroneous command or be
triggered to produce telemetry / feedback, the program must be able to
listen for [`signals`][9] from the operating system and act accordingly.
The only exception is [`SIGKILL`][2], which cannot be blocked or
handled.

The goal of the operating system / kernel is to ensure that "authorized
resources" never exceed "available resources", or *the system will
crash*. This is why `SIGKILL` is unblockable -- it's an action of last
resort by the operating system (OS) to protect itself.

But what if we have an interactive _service_? We don't want to terminate
the process if it gets stuck -- we want it to keep running. So how do we
protect it?

## Self-defense

All programs and services practice a form of self-defense known as
response codes or error codes. These provide signals to an operator or
requestor that vary from "I'm still here" to "please try again later",
"this broke something", or even "your request is broken and I won't do
it."

What does this look like in practice though?

### Service Example

Let's say I have a web server that accepts text, appends it to a file,
and returns a line number to the client.
```
(Step 1) ClientA -- "foo" --> <Service> --> [write to disk]
(Step 2) ClientA <-- "1" -- <Service>
```

Because this service has to write values sequentially, while it is
performing work for Client-A it cannot do anything else. This means that
a client who gives the service sufficient work can monopolize it.

So let's look at what happens when we introduce Client-B:

```
(Step 1) ClientA -- "foo" --> <Service> --> [write to disk]
(Step 2) ClientB -- "bar" --> <Service> --> [BLOCKED]
(Step 3) ClientB <-- "TIMEOUT" -- <Service>
(Step 4) ClientA <-- "1" -- <Service> -- [write completes]
```

Because writes are sequential, if Client-B wants to send "bar" to the
service while it is working for Client-A, we must tell Client-B to wait
or come back later.

By default, a TCP connection to a service will `CONNECT`, send data, and
then receive a response. The operating system multiplexes TCP `CONNECT`
requests, so if the service is busy, the kernel queues the pending
connection until the service can accept it.

Your application, however, can't see this "wait" condition -- it just
goes silent until either the kernel accepts your connection, or you
time out and the kernel evicts you from the queue.

This is not a great solution. What could we do instead?

### Do Nothing?

No really, what if we do nothing?

That's a pretty great situation for the developer -- zero work needed,
and the kernel/OS does the multiplexing. It is, however, a _horrible_
experience for clients and service operators. Before specific solutions
like running a dedicated HTTP service or providing an HTTP stack
in-process, the answer to making a service network-available was
[inetd][10]. It was (and still is) incredibly slow and does not scale
beyond very low traffic rates.
So, in doing nothing, the service appears inconsistent, with
periodically high latency, and the problem is not actually fixed (see
"Denial of Service" above).

### Application Layer Defense

Most network services operate on HTTP, even ones that [bind to unix
sockets][5]. Even [grpc operates on http2][6], so I think it's safe to
say I can use HTTP status codes as an example of how to respond to a
client without requiring _too much_ translation to other stacks.

When an HTTP server is "busy", [rfc6585][8] suggests responding with
[code `429`][4], which maps to "Too Many Requests."

Of note is this paragraph:

>Note that this specification does not define how the origin server
>identifies the user, nor how it counts requests. For example, an
>origin server that is limiting request rates can do so based upon
>counts of requests on a per-resource basis, across the entire server,
>or even among a set of servers. Likewise, it might identify the user
>by its authentication credentials, or a stateful cookie.

Let's refer back to our Service example. If we have Client-A's request
in hand, then any *additional* requests that arrive while Client-A's
request is being processed would be "too many requests" for the server
to handle.

So what we should do is configure the server to do two things at once
(or, at least, two things concurrently).
As pseudocode:

```
var locked bool

fn writeLineToFile(s string, f file) -> (success bool, l int) {
    locked = true;
    n, err = os.Write(s, f);
    locked = false;

    if err != nil {
        return false, 0; // We failed
    }
    return true, n;
}

main() {
    f = open("filename")
    requests = http.Listen(port)

    for r in requests {
        if locked {
            r.reply(http-429) // Too Many Requests
        } else {
            ok, line = writeLineToFile(r.payload, f)
            if ok {
                r.reply(line)
            } else {
                r.reply(http-500) // Error
            }
        }
    }
}
```

Let's take a second to look at this -- the main code opens a file to
persist the data and listens for requests on the HTTP port. All seems
normal until we get to the `if locked` section. If the lock is active,
we immediately stop what we're doing and reply with an [HTTP-429][4].
(For those using gRPC, you can swap "HTTP-429" out for metadata [status
code][7] `UNAVAILABLE(14)`.)

You'll notice that we haven't done any additional checking -- no
computation, no querying the file; we just check the boolean value, then
act.

This is, computationally speaking, very cheap to do. The positioning
also means that we short-circuit the operation _before we even start
down the computationally-expensive path_ of accepting the payload and
doing anything with the file.

Looking further into the `writeLineToFile` function, you'll also see
that we "lock" the file for the minimum part of the operation -- the
part where we actually write to the file. We don't lock the
error-checking portion of the code and we don't lock the reply to the
client. This means that while the code is doing
things-not-writing-to-files, we can go as fast and as concurrently as we
want.
If we increase our activity to three clients (A, B, C), our new
operation graph looks something like this:

```
(Step 1) ClientA -- "foo" --> <Service> --> [write to disk]
(Step 2) ClientB -- "bar" --> <Service> --> [BLOCKED]
(Step 3) ClientC -- "quux" --> <Service> --> [BLOCKED]
(Step 4) ClientB <-- "E: 429" -- <Service>
(Step 5) ClientC <-- "E: 429" -- <Service>
(Step 6) ClientA <-- "1" -- <Service> -- [write completes]
(Step 7) ClientC -- "quux" --> <Service> --> [write to disk]
(Step 8) ClientB -- "bar" --> <Service> --> [BLOCKED]
(Step 9) ClientB <-- "E: 429" -- <Service>
(Step 10) ClientC <-- "2" -- <Service> -- [write completes]
```

You'll notice that at Step-7, ClientC tried again faster than ClientB
and won the race! As we reply to clients, they can decide when to try
again, and since ClientA's request was fulfilled in Step-6, the lock was
lifted, allowing ClientC to make a write.

Regardless of who wins the race to the next write, the service can only
handle one active request at a time, and **it protects itself** from
being forced to handle additional work by telling clients to try again.

## "Authorized Capacity" != "Available Capacity"

The service described previously has an "Authorized Capacity" of one
in-flight request. If the computer running this service were a
single-core tiny computer, it might not have enough power to serve
additional requests (even the HTTP-429 replies), but most computing
devices have either sufficient speed (allowing for concurrent
processing) or additional cores to handle concurrent requests. As a
result, the authorized capacity is less than the total available
capacity of the server.

When running an unmodified, uncontained process on a computer, the
process is "authorized" to use _nearly_ the entire set of resources
available to the OS.
In a cloud environment, a process can run inside a virtual machine (VM)
where the constraints are similar to a bare-metal computer, but (with
few exceptions) it executes within even higher-capacity bare-metal
hardware. _This means the VM is authorized to operate in a
higher-availability environment._

If running in a container or BSD jail, the constraints are set by the
container manager (sometimes just a command-line argument!), and the
kernel then grants (and constrains) those resources to the process
inside.

If a process exceeds its authorized limits in any of these environments,
the supervisor process (OS, container manager, VM, etc.) will forcibly
end the offending process ("kill"/"terminate") and reclaim its
resources. It doesn't matter whether **additional resources exist** that
the process could use -- the supervisory system will act.

## Conclusions

All of this leads to a few heuristics for defining the runtime
environment for an application:

- Determine the maximum resource usage for a given process, then add
  excess capacity to ensure the system has more than it needs by an
  appreciable margin. A heuristic is a minimum of 15-25% headroom at
  lower capacities, and as little as 5% for larger environments (to
  avoid wasting significant capacity).
- Stress-test / load-test your application to its "maximum" throughput
  for the resources you want to use, then set your maximum in-flight
  requests to 90% of that value. This will ensure that your instance is
  always capable of processing its maximum in-flight work within its
  limits.

Both of these ensure that your *authorized* resources (for the
application, your container, and your process) never exceed your
*available* capacity, keeping your service functioning at maximum
throughput in the face of overwhelming requests.
[1]:https://en.wikipedia.org/wiki/Denial-of-service_attack
[2]:https://www.gnu.org/software/libc/manual/html_node/Termination-Signals.html
[3]:https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/413
[4]:https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/429
[5]:https://aweirdimagination.net/2024/04/07/http-over-unix-sockets/
[6]:https://grpc.io/blog/grpc-on-http2/
[7]:https://grpc.io/docs/guides/status-codes/
[8]:https://www.rfc-editor.org/rfc/rfc6585#section-4
[9]:https://www.man7.org/linux/man-pages/man7/signal.7.html
[10]:https://en.wikipedia.org/wiki/Inetd