# Redefining Stability

> Published on 2016-08-31

Defining stability is hard: the word gets applied to systems in very different
states. I propose some guidelines and modifiers that describe the true state of
a "stable" system more precisely.

## Common Points

- The software runs within its defined memory/CPU parameters
- The software does not crash under "normal" load

## Understable

- There is some risk in using the service: not all API edge cases have been
  tested or fixed
- Deployments need monitoring by "that one person" who understands the service
  deeply
- Subject to large or frequent code / API updates
- Relies on "new" or untested libraries that are updated repeatedly (and so is
  subject to frequent deployments)

## Stable

- Most API edge cases have been fixed
- Engineering teams maintain and update the code regularly (as needed or
  monthly)
- Relatively few updates to the business-logic code. The API is versioned and/or
  tested for backwards compatibility
- Libraries are updated and the app is refactored to keep pace with security
  updates and Engineering progress outside the app (e.g. database version
  changes)

## Overstable

- So "rock solid" that no one wants to touch it -- it has "hairs" on it in the
  form of bugfixes, and all edge cases have been handled in code.
- Code is reviewed / updated only when absolutely necessary to prevent
  catastrophic failures. Library and code updates are avoided.
- Would be considered "abandonware" in other contexts
- (Like *understable*) requires "that one person" who understands the code to
  maintain it

## Conclusions and Reasoning

Both understable and overstable systems are in a "bad" state, and we should
avoid both.

Having truly Stable code means regularly reviewing or refactoring code to
account for new Engineering requirements or practices, fixing or adding pieces
as needed, and keeping the team's understanding of the code fresh.
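
As a rough illustration, the three labels can be written down as a small
checklist. This is only a sketch: the `ServiceProfile` fields and the
`classify` function are hypothetical names invented here, and reducing each
category to a few yes/no signals is a simplifying assumption, not a rule from
the lists above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stability(Enum):
    UNDERSTABLE = "understable"
    STABLE = "stable"
    OVERSTABLE = "overstable"


@dataclass
class ServiceProfile:
    # Hypothetical yes/no signals loosely mirroring the bullet points above.
    meets_common_points: bool        # runs within memory/CPU limits, survives normal load
    edge_cases_handled: bool         # most API edge cases tested and fixed
    frequent_large_changes: bool     # large or frequent code / API updates
    dependencies_kept_current: bool  # libraries and code updated regularly


def classify(profile: ServiceProfile) -> Optional[Stability]:
    """Return a stability label, or None if the common points are not met."""
    if not profile.meets_common_points:
        return None  # not "stable" in any sense yet
    if not profile.edge_cases_handled or profile.frequent_large_changes:
        return Stability.UNDERSTABLE
    if not profile.dependencies_kept_current:
        return Stability.OVERSTABLE  # solid, but nobody dares update it
    return Stability.STABLE


legacy = ServiceProfile(
    meets_common_points=True,
    edge_cases_handled=True,
    frequent_large_changes=False,
    dependencies_kept_current=False,
)
print(classify(legacy).value)
```

The order of the checks is a design choice: a service that still changes a lot
is treated as understable even if its libraries are current, since update
avoidance only becomes the defining trait once the code has settled.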