From ec2d0953101fa98a8290feb1dac5f15d9c8179df Mon Sep 17 00:00:00 2001
From: Amir Sarabadani
Date: Fri, 14 Feb 2025 19:31:56 +0100
Subject: objectcache: Introduce dataRedundancy to SqlBagOStuff

This makes SqlBagOStuff write each key to n out of m servers instead of
sharding keys. It also reads from those servers and, in case of
inconsistency, picks the value with the highest exptime.

This is mostly for mainstash and allows us to provide stronger
consistency guarantees while allowing a section to be depooled and
put into maintenance.

It basically implements the logic already used by NoSQL database
systems such as Cassandra (there are two ways to resolve conflicts,
quorum or timestamp; Cassandra uses quorum while we use timestamp).

There are some edge cases where it might still pick the wrong value:
- If the TTL is set to INDEF.
- If the TTL gets shortened for various reasons.
- If we go with two clusters: a value is set, one cluster gets
  depooled, a new value is set, the depooled cluster gets repooled and
  the other one depooled, and then a read happens.

But all of these are extremely rare edge cases and we should be fine.

This also means that if data redundancy is set, locking locks all
sections, and removing the lock requires every section to allow it.
Otherwise, the lock is kept.

Bug: T383327
Change-Id: I80da12396858ee4fc58ae257f6c154b3050df696
---
 docs/config-schema.yaml | 8 ++++++++
 1 file changed, 8 insertions(+)
(limited to 'docs')

diff --git a/docs/config-schema.yaml b/docs/config-schema.yaml
index 2f5ef0c6da5d..437ee899c6e8 100644
--- a/docs/config-schema.yaml
+++ b/docs/config-schema.yaml
@@ -2487,6 +2487,14 @@ config-schema:
 key hashing. This helps mitigate MySQL bugs 61735 and 61736.
 - writeBatchSize: Default maximum number of rows to change in each query for write
 operations that can be chunked into a set of smaller writes.
+ - dataRedundancy: When set to a number higher than one, instead of sharding values,
+ it writes to that many servers (out of all servers) and reads from all of them too.
+ In case of inconsistency between servers, it picks the value with the highest exptime.
+ Mostly useful for stronger consistency such as mainstash.
+ This option has many limitations (for example when TTL is set to indef or changes)
+ and it shouldn't be used to handle race conditions nor canonical data.
+ The main point of data redundancy is to allow depool of a cluster for maintenance
+ without displacing too many keys.
 For MemcachedPhpBagOStuff parameters see {@link MemcachedPhpBagOStuff::__construct}
 For MemcachedPeclBagOStuff parameters see {@link MemcachedPeclBagOStuff::__construct}
 For RedisBagOStuff parameters see {@link Wikimedia\ObjectCache\RedisBagOStuff::__construct}
--
cgit v1.2.3
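
The behaviour described in the commit message (write each key to n of m servers, read back from them, resolve disagreement by highest exptime) can be sketched roughly as follows. This is an illustrative Python model only, not the SqlBagOStuff code: the `Server` class, `pick_servers`, `set_value` and `get_value` are hypothetical names, the hash-ranking placement rule is an assumption, and skipping depooled servers on write is also an assumption.

```python
import hashlib


class Server:
    """Hypothetical stand-in for one SQL cache server (not MediaWiki code)."""

    def __init__(self, name):
        self.name = name
        self.pooled = True
        self.rows = {}  # key -> (value, exptime)


def pick_servers(servers, key, redundancy):
    """Rank servers by a per-key hash and keep the first `redundancy`.

    The ranking rule is an assumption made for this sketch.
    """
    def rank(server):
        return hashlib.sha256(f"{server.name}:{key}".encode()).hexdigest()

    return sorted(servers, key=rank)[:redundancy]


def set_value(servers, key, value, ttl, redundancy, now):
    """Write the value to every pooled replica chosen for this key."""
    exptime = now + ttl
    for server in pick_servers(servers, key, redundancy):
        if server.pooled:
            server.rows[key] = (value, exptime)


def get_value(servers, key, redundancy, now):
    """Read all pooled replicas and return the value with the highest exptime."""
    best = None
    for server in pick_servers(servers, key, redundancy):
        if not server.pooled:
            continue
        row = server.rows.get(key)
        if row is None or row[1] <= now:
            continue  # missing or already expired on this replica
        if best is None or row[1] > best[1]:
            best = row
    return None if best is None else best[0]


# One replica misses a write while depooled; the read still finds the new value.
servers = [Server("db1"), Server("db2"), Server("db3")]
set_value(servers, "k", "old", ttl=60, redundancy=2, now=1000)
first, second = pick_servers(servers, "k", 2)
first.pooled = False
set_value(servers, "k", "new", ttl=60, redundancy=2, now=1010)
first.pooled = True
fresh = get_value(servers, "k", 2, now=1020)

# Shortened-TTL edge case: the stale replica has the higher exptime and wins.
first.pooled = False
set_value(servers, "k", "short", ttl=5, redundancy=2, now=1030)
first.pooled = True
stale = get_value(servers, "k", 2, now=1032)
```

In this model `fresh` is `"new"` because the later write carries the later exptime, while `stale` is `"old"`: the write with a shortened TTL loses to the outdated replica, which is one of the edge cases the commit message lists.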