# FortKnox Data Protection Service (FK)

Provides at least three core systems:

- Central library which performs tokenization / redemption
- Self-hosted HTTP+JSON endpoint for tokenization / redemption
- (Optional) SQL proxy mode. SQL statements containing `TOKENIZE(…)` and `REDEEM(…)` are intercepted: the proxy strips out the values, performs the exchange on-server, and passes the statement upstream to the SQL datastore containing only the replaced values. Only tokenized data goes to or from the SQL datastore.

## Requirements

- Rust 1.74+
- git

## Token Specification

### Token Format

Tokens are 128-bit (16-byte) UUIDs which are always base64 encoded using the URL-safe alphabet without padding. This means that every token is a 22-character string drawn from the alphabet `A-Z`, `a-z`, `0-9`, `-` and `_`.

### Namespacing

Tokens MUST always be generated within a namespace. If a namespace is not provided, the request is rejected.

### Prefixing

A prefix is an unsigned 16-bit value (hex `0x0000`-`0xFFFF`) that uniquely identifies a token source. A 16-bit value allows 65,536 distinct prefixes (0 to 65,535 inclusive).

Prefixes MAY be set via a runtime configuration or defined in the datastore within a namespace. Once defined in the datastore, such prefixes MUST NOT be changed.

## API Specification

Endpoints include:

- `/ping` - liveness check
- `/ready` - readiness check
- `/api/` - API related documentation, including OpenAPI spec
- `/health` - limited internal health data: backend DB type, latency to backend(s), cache usage, prefix (if enabled), signing pubkey (if enabled)
- `/` - Tokenize or Redeem endpoint, split per deployment.

## Notes

- System should be self-contained / self-hosting. Extra "parts" should be separable and/or unnecessary for normal functioning up to a certain limit.
- Must leverage a sqlite datastore by default.
- Connect to PostgreSQL, Oracle, or other providers via ODBC connector (?)
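As an illustration of the Token Format rule above, the following is a minimal sketch of how a 16-byte value becomes a 22-character token. The names `encode_token` and `is_valid_token` are hypothetical and not part of FK's API. The encoder is hand-written against the RFC 4648 URL-safe alphabet so that the sketch needs only the standard library; a real implementation would more likely use an existing base64 crate.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode a 16-byte token as unpadded URL-safe base64.
/// Hypothetical helper: 5 full 3-byte groups give 20 characters and the
/// remaining single byte gives 2 more, so the result is always 22 characters.
fn encode_token(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(22);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        // Without padding, a chunk of k bytes yields k + 1 characters.
        for i in 0..chunk.len() + 1 {
            let idx = (n >> (18 - 6 * i)) & 0x3F;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// Shape check only: 22 characters, all from the URL-safe alphabet.
/// It does not prove the token exists in any datastore.
fn is_valid_token(s: &str) -> bool {
    s.len() == 22 && s.bytes().all(|b| ALPHABET.contains(&b))
}
```

For example, `encode_token(&[0u8; 16])` yields a string of 22 `A` characters, and `is_valid_token` rejects anything containing `=`, `+` or `/`, since those never appear in an unpadded URL-safe encoding.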
## Limitations

If operating with a remote database, FK must not try to operate in a peering / cluster mode.

### Namespace limits

Non-prefixed UUIDs will follow the UUID version 4, variant 1 layout specified in [RFC-4122](https://www.rfc-editor.org/rfc/rfc4122#section-4.4):

             0      0 0      1 1      2 2      3
             0      7 8      5 6      3 4      1
             -----------------------------------
    000-031  xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
    032-063  xxxxxxxx xxxxxxxx 0100xxxx xxxxxxxx
    064-095  10xxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
    096-127  xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx

Prefixed tokens place a 16-bit identifier in bits 0-15, replacing "random" bits of the time-low field. In order to support this and prevent compatibility problems with other UUID representations, FortKnox will [generate UUIDv8-based tokens](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#name-uuid-version-8). Assuming `P` is an identifier bit and using a zero-indexed bit count, the bit-specific structure would be as follows:

             0      0 0      1 1      2 2      3
             0      7 8      5 6      3 4      1
             -----------------------------------
    000-031  PPPPPPPP PPPPPPPP xxxxxxxx xxxxxxxx
    032-063  xxxxxxxx xxxxxxxx 1000xxxx xxxxxxxx
    064-095  10xxxxxk xxxxxxxx xxxxxxxx xxxxxxxx
    096-127  xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx

Of the 128 bits, 16 hold the prefix, 4 the version and 2 the variant, which leaves 106 free bits. This gives a maximum table space of `2^106` (about 8.1129638415e31) values. At 100 bytes per associated token, that is roughly 8.1e33 bytes, or quadrillions of exabytes, per regional namespace.
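The prefixed layout and the table-space arithmetic above can be sketched as follows. This is only an illustration under stated assumptions: the names `build_prefixed_token`, `token_prefix`, `FREE_BITS` and `table_space` are hypothetical, the random bytes are supplied by the caller because the Rust standard library has no random number generator, the variant bits are set to `10` as RFC 4122 specifies, and the `k` bit is treated as an ordinary random bit because its meaning is not defined here.

```rust
/// Build a 16-byte prefixed token from a 16-bit prefix and 16 random bytes.
/// Hypothetical helper; the caller is assumed to supply bytes from a
/// cryptographically secure source.
fn build_prefixed_token(prefix: u16, random: [u8; 16]) -> [u8; 16] {
    let mut token = random;
    // Bits 0-15: the prefix, most significant byte first.
    token[0..2].copy_from_slice(&prefix.to_be_bytes());
    // Bits 48-51: version nibble 0b1000 (UUIDv8).
    token[6] = (token[6] & 0x0F) | 0x80;
    // Bits 64-65: variant 0b10 (assumption: the RFC 4122 variant).
    token[8] = (token[8] & 0x3F) | 0x80;
    token
}

/// Read the 16-bit prefix back out of a token.
fn token_prefix(token: &[u8; 16]) -> u16 {
    u16::from_be_bytes([token[0], token[1]])
}

/// Free bits per token: 128 total, minus prefix, version and variant bits.
const FREE_BITS: u32 = 128 - 16 - 4 - 2;

/// Number of distinct tokens available under one prefix (2^106).
fn table_space() -> u128 {
    1u128 << FREE_BITS
}
```

Calling `build_prefixed_token(0xBEEF, bytes)` overwrites only the 22 fixed bits and leaves the remaining 106 bits of `bytes` untouched, which is where the `2^106` figure comes from.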