FortKnox Data Protection Service (FK)

Provides at least three core systems:

  • Central library which performs tokenization / redemption
  • Self-hosted HTTP+JSON endpoint for Tokenization / Redemption
  • (Optional) SQL proxy mode. When a SQL statement contains TOKENIZE(…) or REDEEM(…), FK strips out the wrapped values, performs the exchange on-server, and passes the statement upstream with only the replaced values, so only tokenized data travels to or from the SQL datastore.
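
The rewrite step of the SQL proxy mode can be sketched as below. This is an illustration only, not FK's implementation: the function name rewrite_tokenize, the single-quoted literal syntax, and the callback that stands in for the real tokenization call are all assumptions. It ignores escaped quotes, case variations, and REDEEM(…), which a real proxy would handle with a proper SQL parser.

```rust
/// Replace every `TOKENIZE('literal')` call in `sql` with a quoted token
/// produced by `tokenize` (a hypothetical stand-in for the real exchange).
/// Text that does not match the pattern is copied through unchanged.
fn rewrite_tokenize<F: FnMut(&str) -> String>(sql: &str, mut tokenize: F) -> String {
    const OPEN: &str = "TOKENIZE('";
    const CLOSE: &str = "')";
    let mut out = String::with_capacity(sql.len());
    let mut rest = sql;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Copy the SQL before the call, then the quoted token.
                out.push_str(&rest[..start]);
                out.push('\'');
                out.push_str(&tokenize(&after_open[..end]));
                out.push('\'');
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unterminated call: leave the remainder untouched.
            None => break,
        }
    }
    out.push_str(rest);
    out
}
```

With a callback that always returns "TOK", the statement INSERT INTO users (ssn) VALUES (TOKENIZE('123-45-6789')) would be forwarded upstream as INSERT INTO users (ssn) VALUES ('TOK').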

Requirements

  • Rust 1.74+
  • git

Token Specification

Token Format

Tokens are 128-bit (16-byte) UUIDs, always base64-encoded using the URL-safe alphabet without padding. Every token is therefore a 22-character string drawn from the alphabet "[a-z][A-Z][0-9]_-".
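
A minimal sketch of that encoding, using only the Rust standard library. The function name encode_token is hypothetical. The alphabet is the URL-safe one from RFC 4648; a real build might use an existing base64 crate instead of the hand-rolled loop shown here.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode a 16-byte token as unpadded URL-safe base64.
/// 16 bytes = five 3-byte groups (20 chars) + one byte (2 chars) = 22 chars.
fn encode_token(raw: &[u8; 16]) -> String {
    let mut out = String::with_capacity(22);
    for chunk in raw.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32, first byte highest.
        let mut buf = 0u32;
        for (i, b) in chunk.iter().enumerate() {
            buf |= (*b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars; no padding.
        let n_chars = chunk.len() + 1;
        for i in 0..n_chars {
            let idx = (buf >> (18 - 6 * i)) & 0x3F;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}
```

Because the input length is fixed at 16 bytes, the output length is always 22 and the padding characters carry no information, which is why they can be dropped.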

Namespacing

Tokens MUST always be generated within a namespace. If a namespace is not provided, the request is rejected.

Prefixing

A prefix is an unsigned 16-bit value (hex: 0x0000-0xFFFF) that uniquely identifies a token source. This allows 65,536 distinct prefixes (0-65535 inclusive).

Prefixes MAY be set via a runtime configuration or defined in the datastore within a namespace. Once defined in the datastore, such prefixes MUST NOT be changed.

API Specification

Endpoints include:

  • /ping - liveness check
  • /ready - readiness check
  • /api/ - API related documentation, including OpenAPI spec
  • /health - limited internal health data: backend DB type, latency to backend(s), cache usage, prefix (if enabled), signing pubkey (if enabled)
  • / - Tokenize or Redeem endpoint, split per deployment.

Notes

  • System should be self-contained / self-hosting. Extra "parts" should be separable and/or unnecessary for normal functioning up to a certain limit.
  • Must leverage a sqlite datastore by default.
  • Connect to PostgreSQL, Oracle, or other providers via ODBC connector (?)

Limitations

If operating with a remote database, FK must not try to operate in a peering / cluster mode.

Namespace limits

Non-prefixed UUIDs will follow the UUIDv4-Variant1 specification in RFC-4122:

            0      0 0      1 1      2 2      3
            0      7 8      5 6      3 4      1
            -----------------------------------
 000-031    xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx 
 032-063    xxxxxxxx xxxxxxxx 0100xxxx xxxxxxxx
 064-095    10xxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
 096-127    xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
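
As a sketch of how the fixed bits in that layout could be applied, the function below stamps the version nibble and variant bits onto 16 bytes that the caller has already filled from a random source. The name stamp_v4 is hypothetical. The variant bits are written as binary 10, the value RFC 4122 assigns to this variant.

```rust
/// Stamp the UUIDv4 version and RFC 4122 variant onto 16 random bytes.
/// The caller is assumed to supply bytes from a cryptographically secure RNG.
fn stamp_v4(mut bytes: [u8; 16]) -> [u8; 16] {
    // Bits 48-51 (high nibble of byte 6) carry the version: 0100 = 4.
    bytes[6] = (bytes[6] & 0x0F) | 0x40;
    // Bits 64-65 (top two bits of byte 8) carry the variant: 10.
    bytes[8] = (bytes[8] & 0x3F) | 0x80;
    bytes
}
```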

Prefixed tokens use a 16-bit identifier in place of 16 otherwise random bits of the time-low field (bits 0-15 in the layout below). To support this without causing compatibility problems with other UUID representations, FortKnox will generate UUIDv8-based tokens.

Assuming P is an identifier bit and using a zero-index count, the bit-specific structure would be as follows:

            0      0 0      1 1      2 2      3
            0      7 8      5 6      3 4      1
            -----------------------------------
 000-031    PPPPPPPP PPPPPPPP xxxxxxxx xxxxxxxx 
 032-063    xxxxxxxx xxxxxxxx 1000xxxx xxxxxxxx
 064-095    10xxxxxk xxxxxxxx xxxxxxxx xxxxxxxx
 096-127    xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
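
A minimal sketch of building and reading a prefixed token under this layout. The names stamp_prefixed and prefix_of are hypothetical, and storing the prefix in big-endian byte order is an assumption. The variant bits are written as binary 10 per RFC 4122, and the bit marked k in the diagram is left as supplied by the caller because its meaning is not defined here.

```rust
/// Write a 16-bit prefix plus UUIDv8 version/variant bits onto 16 random bytes.
fn stamp_prefixed(prefix: u16, mut bytes: [u8; 16]) -> [u8; 16] {
    // Bits 0-15: the prefix, most significant byte first (assumed byte order).
    bytes[0..2].copy_from_slice(&prefix.to_be_bytes());
    // Bits 48-51: version nibble 1000 = 8.
    bytes[6] = (bytes[6] & 0x0F) | 0x80;
    // Bits 64-65: variant bits 10.
    bytes[8] = (bytes[8] & 0x3F) | 0x80;
    bytes
}

/// Recover the prefix from a prefixed token.
fn prefix_of(token: &[u8; 16]) -> u16 {
    u16::from_be_bytes([token[0], token[1]])
}
```

Keeping the prefix in the leading two bytes means a token's source can be identified without consulting the datastore.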

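This leaves a maximum table space of 2^106 values (128 bits minus 16 prefix bits, 4 version bits, and 2 variant bits), or about 8.1129638415e31 tokens. At 100 bytes per associated token, that is roughly 8.1e33 bytes, on the order of quadrillions of exabytes, per regional namespace.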
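
The arithmetic behind that estimate can be checked with a short sketch. The function names are hypothetical, and an exabyte is taken as 10^18 bytes.

```rust
/// Bits left for random data: 128 total, minus 16 prefix, 4 version, 2 variant.
fn free_bits() -> u32 {
    128 - 16 - 4 - 2
}

/// Number of distinct tokens available under one prefix.
fn token_space() -> u128 {
    1u128 << free_bits()
}

/// Storage needed to fill the whole space, in whole exabytes (10^18 bytes).
fn storage_exabytes(bytes_per_token: u128) -> u128 {
    token_space() * bytes_per_token / 10u128.pow(18)
}
```

With 100 bytes per token, storage_exabytes(100) comes to about 8.1e15 exabytes, far beyond any practical storage limit.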