Diffstat (limited to 'docs/datastores.md')
-rw-r--r-- | docs/datastores.md | 89 |
1 files changed, 89 insertions, 0 deletions
diff --git a/docs/datastores.md b/docs/datastores.md
index e69de29..ef22c72 100644
--- a/docs/datastores.md
+++ b/docs/datastores.md
@@ -0,0 +1,89 @@
+# Data Stores
+
+## Scope
+
+This standard prescribes database and data storage technologies used to solve many related data-retention concerns.
+The solutions recommended below are designed to encourage deep expertise in a few stable and well-understood systems, rather than maximal "fit" for each distinct use case.
+
+As such, the solutions may not be optimal for every use case, but their performance, maintenance, optimization, and reliability requirements are understood and supported by the engineering community.
+
+## Terms
+
+- _Database_: provides long-term, durable storage for data whose loss or unavailability would mean violating an application's Availability or Business requirements.
+- _Cache_: provides short-term, volatile storage that does not preserve data.
+  Caches are not in the scope of this standard.
+
+## Capability Matrix
+
+| Capabilities | RDBMS | KV Store | File/Object |
+|--------------|-------|----------|-------------|
+| [Relational] | X | O | |
+| [Key-Value] | X | X | X |
+| [Document-Oriented] | X | X | X |
+| [Object-Based] | X | X | X |
+
+_O_: While KV-stores cannot store relational data, some KV-focused databases provide relational-like "tagging" and other attribute aggregations.
+
+## Selection Criteria
+
+### PostgreSQL
+
+Applications SHOULD use PostgreSQL (Aurora in Cloud environments and the latest stable release in OnPrem environments).
+
+PostgreSQL supports all storage methods across all environments, including:
+
+- Simple [Key-Value] stores (via [hstore])
+- [Document-Oriented] storage and queries (via [jsonb])
+- [Large-object] storage directly within the database
+
+If the application is running in the Cloud environment and needs a total data size over [64TB (RDS)][1] or [128TB (Aurora)][2], then it MUST use another approved option.
+
+### DynamoDB
+
+If an Application is hosted in AWS and requires many of:
+
+- Flexible, schemaless data model that will change in the future
+- Read-heavy access model for items
+- Single-digit millisecond response times
+- Very fast caching for hot keys and values (< 10ms)
+- Multi-region deployments
+
+and does not require any of:
+
+- Strongly consistent reads and writes in all situations
+- Item/Row sizes over 400 KB
+- A total data size over 1TB
+- Joins between different stored values
+
+then it MAY use DynamoDB.
+
+### S3 / File store
+
+If the Application persists data with many of the following:
+
+- Large files (>100MB each)
+- Large volumes of data (>1 million records)
+- Simple identification requirements (e.g., "filename as key")
+- Simple relational requirements (e.g., folders and files)
+
+and does not require any of the following:
+
+- Low-latency access (< 200ms)
+- Relational join operations
+- Low response times (< 1000ms)
+
+then it MAY use a filestore or S3.
+
+[Aurora (PostgreSQL)]: https://aws.amazon.com/rds/aurora/postgresql-features/
+[RDS (PostgreSQL)]: https://aws.amazon.com/rds/postgresql/
+[DynamoDB]: https://aws.amazon.com/dynamodb/
+[S3]: https://aws.amazon.com/s3/
+[Relational]: https://en.wikipedia.org/wiki/Relational_database
+[Key-Value]: https://en.wikipedia.org/wiki/Key-value_database
+[Document-Oriented]: https://en.wikipedia.org/wiki/Document-oriented_database
+[Object-Based]: https://en.wikipedia.org/wiki/Object_storage#Cloud_storage
+[hstore]: https://www.postgresql.org/docs/13/hstore.html
+[jsonb]: https://www.postgresql.org/docs/current/datatype-json.html
+[Large-object]: https://www.postgresql.org/docs/13/largeobjects.html
+[1]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html
+[2]: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/CHAP_Limits.html
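
As a reading aid for the selection criteria added in this diff, the Python sketch below encodes them as a small helper. It is not part of the patch. The flag names, the function names, and the `min_matches=2` threshold standing in for "many of" are assumptions made for illustration; the standard does not define them, and the size limits are simply the figures quoted in the text.

```python
# Size limits quoted in the standard, in TB.
RDS_LIMIT_TB = 64
AURORA_LIMIT_TB = 128

# Hypothetical short labels for the criteria listed under each section.
DYNAMODB_WANTS = {"schemaless", "read_heavy", "single_digit_ms",
                  "hot_key_cache", "multi_region"}
DYNAMODB_BLOCKERS = {"strong_consistency", "items_over_400kb",
                     "over_1tb", "joins"}
S3_WANTS = {"large_files", "large_volumes",
            "simple_identification", "simple_hierarchy"}
S3_BLOCKERS = {"low_latency", "joins", "low_response_time"}


def postgres_allowed(total_size_tb, cloud=True, engine="aurora"):
    """PostgreSQL is the default unless the Cloud size limit is exceeded."""
    if not cloud:
        return True  # the standard states no size limit for OnPrem
    limit = AURORA_LIMIT_TB if engine == "aurora" else RDS_LIMIT_TB
    return total_size_tb <= limit


def may_use(needs, wants, blockers, min_matches=2):
    """True when enough 'many of' items match and no 'any of' item does.

    min_matches is an assumed threshold; the standard only says "many of".
    """
    return len(needs & wants) >= min_matches and not (needs & blockers)


def candidate_stores(needs, total_size_tb, cloud=True, in_aws=True,
                     engine="aurora"):
    """Return the stores the criteria would permit, in document order."""
    needs = set(needs)
    if total_size_tb > 1:
        needs.add("over_1tb")  # derived from the DynamoDB 1TB exclusion
    stores = []
    if postgres_allowed(total_size_tb, cloud, engine):
        stores.append("PostgreSQL")
    if in_aws and may_use(needs, DYNAMODB_WANTS, DYNAMODB_BLOCKERS):
        stores.append("DynamoDB")
    if may_use(needs, S3_WANTS, S3_BLOCKERS):
        stores.append("S3 / File store")
    return stores


print(candidate_stores({"read_heavy", "multi_region"}, total_size_tb=0.5))
# → ['PostgreSQL', 'DynamoDB']
```

PostgreSQL appears first whenever it is allowed because the standard marks it as SHOULD, while DynamoDB and S3 are only MAY options.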