FortKnox Data Protection Service (FK)
Provides at least three core systems:
- Central library which performs tokenization / redemption
- Self-hosted HTTP+JSON endpoint for Tokenization / Redemption
- (Optional) SQL proxy mode.
SQL statements containing TOKENIZE(…) and REDEEM(…) cause the proxy to strip out the wrapped values, perform the exchange on-server, and pass the statement upstream to the SQL datastore with only the replaced values. Only tokenized data travels to or from the SQL datastore.
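The rewrite step of the proxy mode could look roughly like the sketch below. It assumes the argument of TOKENIZE is a single-quoted string literal with no escaped quotes. `rewrite_tokenize` and the `exchange` callback are hypothetical names, not part of FortKnox. A real proxy would need a proper SQL parser and a matching path for REDEEM.

```rust
/// Replaces every `TOKENIZE('literal')` call in `sql` with a quoted token
/// returned by `exchange`. Hypothetical helper for illustration only:
/// it does not handle escaped quotes, lowercase keywords, or REDEEM.
fn rewrite_tokenize<F>(sql: &str, mut exchange: F) -> String
where
    F: FnMut(&str) -> String,
{
    const OPEN: &str = "TOKENIZE('";
    const CLOSE: &str = "')";

    let mut out = String::with_capacity(sql.len());
    let mut rest = sql;

    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        // An unterminated call is passed through untouched.
        let Some(end) = after_open.find(CLOSE) else { break };

        out.push_str(&rest[..start]); // text before the call
        out.push('\'');
        out.push_str(&exchange(&after_open[..end])); // token replaces the raw value
        out.push('\'');
        rest = &after_open[end + CLOSE.len()..];
    }

    out.push_str(rest); // trailing text, or the whole input if nothing matched
    out
}
```

With an `exchange` callback that returns `tok_11`, the statement `INSERT INTO users (ssn) VALUES (TOKENIZE('123-45-6789'))` becomes `INSERT INTO users (ssn) VALUES ('tok_11')`, so the raw value never reaches the upstream datastore.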
Requirements
- Rust 1.74+
- git
Token Specification
Token Format
Tokens are 128-bit (16 byte) UUIDs which are always base64 encoded using the URL-safe alphabet without padding.
This means that every token is a 22-character string drawn from the alphabet "[a-z][A-Z][0-9]_-".
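To illustrate why 16 bytes always produce 22 characters, here is a minimal sketch of unpadded URL-safe base64 encoding using only the standard library. `encode_token` is a hypothetical helper name. A real implementation would more likely use an existing base64 crate.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encodes a 16-byte token as unpadded URL-safe base64.
/// Hypothetical helper, shown for illustration only.
fn encode_token(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(22);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let mut n: u32 = 0;
        for (i, b) in chunk.iter().enumerate() {
            n |= (*b as u32) << (16 - 8 * i);
        }
        // 3 bytes yield 4 characters; the final 1-byte chunk yields 2.
        let chars = chunk.len() + 1;
        for i in 0..chars {
            let idx = (n >> (18 - 6 * i)) & 0x3F;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}
```

Five full 3-byte chunks give 20 characters and the remaining byte gives 2 more, for 22 in total. An all-zero token encodes to 22 `A` characters.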
Namespacing
Tokens MUST always be generated within a namespace. If a namespace is not provided, the request is rejected.
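The rejection rule can be sketched as a small validation step. `RequestError` and `require_namespace` are hypothetical names, and treating an empty or whitespace-only namespace as missing is an assumption that the text above does not state.

```rust
/// Hypothetical error type for rejected requests.
#[derive(Debug, PartialEq)]
enum RequestError {
    MissingNamespace,
}

/// Returns the trimmed namespace, or rejects the request if none was given.
/// Assumption: blank namespaces are treated the same as absent ones.
fn require_namespace(ns: Option<&str>) -> Result<&str, RequestError> {
    match ns.map(str::trim) {
        Some(n) if !n.is_empty() => Ok(n),
        _ => Err(RequestError::MissingNamespace),
    }
}
```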
Prefixing
A prefix is an unsigned 16-bit value (hex: 0x0000-0xFFFF) used to uniquely identify token sources. A 16-bit value allows 65,536 distinct prefixes (0-65535 inclusive).
Prefixes MAY be set via a runtime configuration or defined in the datastore within a namespace. Once defined in the datastore, such prefixes MUST NOT be changed.
API Specification
Endpoints include:
/ping
- liveness check
/ready
- readiness check
/api/
- API related documentation, including OpenAPI spec
/health
- limited internal health data: backend DB type, latency to backend(s), cache usage, prefix (if enabled), signing pubkey (if enabled)
/
- Tokenize or Redeem endpoint, split per deployment.
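As a rough illustration of the endpoint layout, the paths could be dispatched as in the sketch below. The `Route` enum and `route` function are hypothetical. HTTP methods, request bodies, and the per-deployment split between Tokenize and Redeem are left out.

```rust
/// Hypothetical route table mirroring the endpoints listed above.
#[derive(Debug, PartialEq)]
enum Route {
    Ping,     // liveness check
    Ready,    // readiness check
    ApiDocs,  // API documentation, including the OpenAPI spec
    Health,   // limited internal health data
    Exchange, // Tokenize or Redeem, depending on the deployment
    NotFound,
}

/// Maps a request path to a route. Anything under /api/ is treated as
/// documentation, which is an assumption about how the docs are served.
fn route(path: &str) -> Route {
    match path {
        "/ping" => Route::Ping,
        "/ready" => Route::Ready,
        "/health" => Route::Health,
        "/" => Route::Exchange,
        p if p == "/api" || p.starts_with("/api/") => Route::ApiDocs,
        _ => Route::NotFound,
    }
}
```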
Notes
- System should be self-contained / self-hosting. Extra "parts" should be separable and/or unnecessary for normal functioning up to a certain limit.
- Must leverage a sqlite datastore by default.
- Connect to PostgreSQL, Oracle, or other providers via ODBC connector (?)
Limitations
If operating with a remote database, FK must not try to operate in a peering / cluster mode.
Namespace limits
Non-prefixed UUIDs will follow the UUIDv4-Variant1 specification in RFC-4122:
          0      0 0      1 1      2 2      3
          0      7 8      5 6      3 4      1
          -----------------------------------
000-031   xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
032-063   xxxxxxxx xxxxxxxx 0100xxxx xxxxxxxx
064-095   10xxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
096-127   xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
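A minimal sketch of stamping the fixed bits onto 16 random bytes follows. It uses the RFC 4122 values: version `0100` in bits 48-51 and variant `10` in bits 64-65. `stamp_v4` is a hypothetical helper name, and the caller is assumed to supply bytes from a cryptographically secure random source.

```rust
/// Applies the UUIDv4 version and RFC 4122 variant bits to 16 random bytes.
/// Hypothetical helper; the random bytes must come from the caller.
fn stamp_v4(mut bytes: [u8; 16]) -> [u8; 16] {
    // Byte 6 holds bits 48-55: set the high nibble to 0100 (version 4).
    bytes[6] = (bytes[6] & 0x0F) | 0x40;
    // Byte 8 holds bits 64-71: set the top two bits to 10 (RFC 4122 variant).
    bytes[8] = (bytes[8] & 0x3F) | 0x80;
    bytes
}
```

The remaining 122 bits are left exactly as the random source produced them.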
Prefixed Tokens will use a 16-bit identifier in place of the first 16 bits of the token (bits 0-15, the leading half of the time_low field), which would otherwise be random. To support this and avoid compatibility problems with other UUID representations, FortKnox will generate UUIDv8-based tokens.
Assuming P is an identifier bit and using a zero-index count, the bit-specific structure would be as follows:
          0      0 0      1 1      2 2      3
          0      7 8      5 6      3 4      1
          -----------------------------------
000-031   PPPPPPPP PPPPPPPP xxxxxxxx xxxxxxxx
032-063   xxxxxxxx xxxxxxxx 1000xxxx xxxxxxxx
064-095   10xxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
096-127   xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
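The prefixed layout can be sketched in the same way. The helper names `stamp_prefixed_v8` and `prefix_of` are hypothetical. Storing the prefix in big-endian order, so that its most significant bit lands in bit 0, is an assumption, as is the use of the RFC 4122 variant bits `10`.

```rust
/// Writes a 16-bit prefix plus the UUIDv8 version and variant bits onto
/// 16 random bytes. Hypothetical helper; byte order is an assumption.
fn stamp_prefixed_v8(prefix: u16, mut bytes: [u8; 16]) -> [u8; 16] {
    // Bits 0-15: the prefix, most significant byte first.
    bytes[0..2].copy_from_slice(&prefix.to_be_bytes());
    // Bits 48-51: version 1000 (UUIDv8).
    bytes[6] = (bytes[6] & 0x0F) | 0x80;
    // Bits 64-65: variant 10.
    bytes[8] = (bytes[8] & 0x3F) | 0x80;
    bytes
}

/// Reads the prefix back out of a prefixed token.
fn prefix_of(token: &[u8; 16]) -> u16 {
    u16::from_be_bytes([token[0], token[1]])
}
```

Because the prefix sits in the leading bytes, a token's source can be identified without consulting the datastore.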
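This leaves 106 random bits (128 minus 16 prefix bits, 4 version bits, and 2 variant bits), for a maximum table space of 2^106 values, or approximately 8.1129638415e31 tokens. At 100 bytes per associated token, that is roughly 8.1e33 bytes, or about 8.1e15 exabytes, per regional namespace.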
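The arithmetic can be checked with 128-bit integers. In this sketch the function names are hypothetical, and an exabyte is taken as 10^18 bytes.

```rust
/// Random bits left after the 16-bit prefix, 4 version bits, and 2 variant bits.
fn random_bits() -> u32 {
    128 - 16 - 4 - 2
}

/// Number of distinct prefixed tokens: 2^106.
fn table_space() -> u128 {
    1u128 << random_bits()
}

/// Storage needed for every token, in whole exabytes (1 EB = 10^18 bytes).
fn capacity_exabytes(bytes_per_token: u128) -> u128 {
    table_space() * bytes_per_token / 10u128.pow(18)
}
```

At 100 bytes per token, `capacity_exabytes(100)` comes to a little over 8.1e15, which is far more storage than any deployment could provision.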