rdbms: Remove "read-only primary" and restore "short cache" in lagged mode

== What

* Trigger "lagged replica mode" even when all replicas are lagged.
  This partially undoes I53b261dfe (f59e4ad95e). This restores
  "shorten CDN cache" behaviour for third-party wikis.

* Change "lagged replica mode" to no longer have the side-effect of
  making primary DBs read-only, under any circumstances.

* Change "getLaggedReplicaMode" to no longer trigger a connection.
  If no data was queried, no stale data was served. (Mostly: stale
  data could still be served via Memcached, but we don't measure that
  either way, and in production it should be taken care of by rebound
  purges. As explained below, this mode doesn't matter for production
  anyway.)

== Why

Follows-up I53b261dfe (f59e4ad95e), which changed lagged-replica mode
to no longer be triggered when all replicas are lagged. This was
motivated by the fact that:

1. It is understood that at WMF it is impossible for all replicas to
   be lagged, due to MySQL semi-sync.
2. This mode often triggers at WMF anyway, due to how we measure lag.
3. When this mode is activated, it needlessly makes MW read-only.

The downside is that lagged-replica mode no longer triggers for
third parties without semi-sync either. As a result, the side-effect
of shortening caches doesn't work, and the Skin footer message isn't
shown.

The problem we want to solve, which I think applies to both WMF and
third parties, is that MediaWiki doesn't need to be responsible for
turning the site read-only when a replica is lagged. At scales where
that is a big problem, the site admin can use MySQL semi-sync. Smaller
sites either don't have replicas, or are small enough to not yet have
this problem and would likely be bottlenecked well before the DB level
anyway.

As such, solve T314975 differently by removing this activation of
read-only mode instead, and preserve/restore the rest of it in
simpler form.

== Background

The docs were very outdated, in particular the statement about
30 seconds. Historical references to that changing:

* 2015: The MW default setting was lowered from 30s to 10s,
  with commit I56a7f35382 (453d88605b, T95501).
* 2015: The WMF config for 'max lag' in $wgLBFactoryConf was gradually
  lowered from 30s to 6s, e.g. commit I7f71d75b (744722e784) and
  I02b1789095 (e8276e074f).
* 2019: The MW default was lowered from 10s to 6s,
  with commit Ic2e82a8cc (e8276e074fa).

Bug: T314975
Change-Id: Ie55aad42d99c71c54137c7c4138093082e561097
author: Timo Tijhof <krinkle@fastmail.com> 2023-01-17 19:07:39 +0000
committer: Krinkle <krinkle@fastmail.com> 2023-01-30 22:09:09 +0000
commit: 6365db67ba34aded441f28b8e00463c74e2f21d9 (patch)
tree: 32b66e7fbbada2104b770cfdf5f812db8d3611b9
parent: 71c014820f6414e06951cb76aa9b793b919846e2 (diff)
download: mediawikicore-6365db67ba34aded441f28b8e00463c74e2f21d9.tar.gz
mediawikicore-6365db67ba34aded441f28b8e00463c74e2f21d9.zip
2 files changed, 6 insertions, 36 deletions
diff --git a/docs/database.md b/docs/database.md
index 43fd61cf3cba..cf19a86ddba5 100644
--- a/docs/database.md
+++ b/docs/database.md
@@ -52,13 +52,13 @@ It's often the case that the best algorithm to use for a given task depends on w
 
 ## Lag
 
-Lag primarily occurs when large write queries are sent to the primary. Writes on the primary are executed in parallel, but they are executed in serial when they are replicated to the replicas. The primary writes the query to the binlog when the transaction is committed. The replicas poll the binlog and start executing the query as soon as it appears. They can service reads while they are performing a write query, but will not read anything more from the binlog and thus will perform no more writes. This means that if the write query runs for a long time, the replicas will lag behind the primary for the time it takes for the write query to complete.
+Lag primarily occurs when large write queries are sent to the primary. Writes are executed in parallel on the primary, but they are executed serially when replicated to the replicas. The primary database may not write its query to the binlog for replication, until after the transaction is committed. The replicas poll this binlog, and apply the query locally as soon as it appears there. They can respond to reads while they are applying the replicated writes, but will not read anything more from the binlog and thus will perform no more writes. This means that if the write query runs for a long time, the replicas will lag behind the primary for as long as it takes for the write query to complete.
 
-Lag can be exacerbated by high read load. MediaWiki's load balancer will stop sending reads to a replica when it is lagged by more than 30 seconds. If the load ratios are set incorrectly, or if there is too much load generally, this may lead to a replica permanently hovering around 30 seconds lag.
+Lag can be exacerbated by high read load. MediaWiki's LoadBalancer will avoid sending reads to a replica lagged by more than a few seconds.
 
-If all replicas are lagged by more than 30 seconds, MediaWiki will stop writing to the database. All edits and other write operations will be refused, with an error returned to the user. This gives the replicas a chance to catch up. Before we had this mechanism, the replicas would regularly lag by several minutes, making review of recent edits difficult.
+MediaWiki does its best for multiple queries during a given web request to represent a single consistent snapshot of the database at a given point in time. In addition to this, MediaWiki tries to ensure that a user sees the wiki change in chronological order, such that subsequent web requests see the same or newer data. In particular, it tries to ensure that the user's own actions are immediately reflected in subsequent requests. This is done by saving the primary's binlog position after a database write, and during subsequent connections to a replica it will wait as-needed to catch up to that position before sending any read queries.
 
-In addition to this, MediaWiki attempts to ensure that the user sees events occurring on the wiki in chronological order. A few seconds of lag can be tolerated, as long as the user sees a consistent picture from subsequent requests. This is done by saving the primary binlog position in the session, and then at the start of each request, waiting for the replica to catch up to that position before doing any reads from it. If this wait times out, reads are allowed anyway, but the request is considered to be in "lagged replica mode". Lagged replica mode can be checked by calling `LoadBalancer::getLaggedReplicaMode()`. The only practical consequence at present is a warning displayed in the page footer.
+If the wait for chronology protection times out, or more generally if a queried replica is lagged by more than 6 seconds (`LoadBalancer::MAX_LAG_DEFAULT`, configurable via `max lag` in `$wgLBFactoryConf`), the MediaWiki request is considered to be in "lagged replica mode" (`ILBFactory::laggedReplicaUsed`). In this mode, MediaWiki automatically shortens the expiry of object caching (via WANObjectCache) and HTTP/CDN caching to ensure that any stale data will soon converge.
 
 ## Lag avoidance
 
diff --git a/includes/libs/rdbms/loadbalancer/LoadBalancer.php b/includes/libs/rdbms/loadbalancer/LoadBalancer.php
index 680739a0431f..b598ea2c0fa4 100644
--- a/includes/libs/rdbms/loadbalancer/LoadBalancer.php
+++ b/includes/libs/rdbms/loadbalancer/LoadBalancer.php
@@ -602,6 +602,7 @@ class LoadBalancer implements ILoadBalancerForOwner {
 			}
 
 			if ( $i === false && count( $currentLoads ) ) {
+				$this->laggedReplicaMode = true;
 				// All replica DBs lagged, just pick anything.
 				$i = ArrayUtils::pickRandom( $currentLoads );
 			}
@@ -890,22 +891,7 @@ class LoadBalancer implements ILoadBalancerForOwner {
 		// The use of getServerConnection() instead of getConnection() avoids infinite loops.
 		$serverIndex = $this->getConnectionIndex( $i, $groups );
 		// Get an open connection to that server (might trigger a new connection)
-		$conn = $this->getServerConnection( $serverIndex, $domain, $flags );
-		// Set primary DB handles as read-only if there is high replication lag
-		if (
-			$conn &&
-			$serverIndex === $this->getWriterIndex() &&
-			$this->getLaggedReplicaMode() &&
-			!is_string( $conn->getLBInfo( $conn::LB_READ_ONLY_REASON ) )
-		) {
-			$genericIndex = $this->getExistingReaderIndex( self::GROUP_GENERIC );
-			$reason = ( $genericIndex !== self::READER_INDEX_NONE )
-				? 'The database is read-only until replication lag decreases.'
-				: 'The database is read-only until replica database servers becomes reachable.';
-			$conn->setLBInfo( $conn::LB_READ_ONLY_REASON, $reason );
-		}
-
-		return $conn;
+		return $this->getServerConnection( $serverIndex, $domain, $flags );
 	}
 
 	public function getServerConnection( $i, $domain, $flags = 0 ) {
@@ -1952,16 +1938,6 @@ class LoadBalancer implements ILoadBalancerForOwner {
 	}
 
 	public function getLaggedReplicaMode() {
-		if ( $this->laggedReplicaMode ) {
-			// Stay in lagged replica mode once it is observed on any domain
-			return true;
-		}
-
-		if ( $this->hasStreamingReplicaServers() ) {
-			// This will set "laggedReplicaMode" as needed
-			$this->getReaderIndex( self::GROUP_GENERIC );
-		}
-
 		return $this->laggedReplicaMode;
 	}
 
@@ -1974,12 +1950,6 @@ class LoadBalancer implements ILoadBalancerForOwner {
 			return $this->readOnlyReason;
 		} elseif ( $this->isPrimaryRunningReadOnly() ) {
 			return 'The primary database server is running in read-only mode.';
-		} elseif ( $this->getLaggedReplicaMode() ) {
-			$genericIndex = $this->getExistingReaderIndex( self::GROUP_GENERIC );
-
-			return ( $genericIndex !== self::READER_INDEX_NONE )
-				? 'The database is read-only until replication lag decreases.'
-				: 'The database is read-only until a replica database server becomes reachable.';
 		}
 
 		return false;
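
The combined effect of the hunks above can be illustrated with a small standalone model. This is a minimal sketch, not MediaWiki's LoadBalancer: the class `LagAwarePickerSketch`, its methods `pickReplica()` and `adaptTtl()`, the per-replica lag array, and the 30-second lagged TTL are all hypothetical names and values chosen for illustration. Only the 6-second threshold and the three behaviours (flag lagged mode when every replica is lagged, keep `getLaggedReplicaMode()` a plain getter, never go read-only because of lag) come from the commit itself.

```php
<?php
// Hypothetical, simplified model of the behaviour after this change.
// None of these names are part of MediaWiki's actual API.
final class LagAwarePickerSketch {
	private bool $laggedReplicaMode = false;

	/**
	 * @param array<string,float> $lagByReplica Assumed input: lag in seconds per replica name
	 * @param float $maxLag Lag threshold in seconds (6s mirrors the documented default)
	 */
	public function __construct(
		private array $lagByReplica,
		private float $maxLag = 6.0
	) {
	}

	/** Pick a replica for reads; flag lagged mode when none is within the threshold. */
	public function pickReplica(): ?string {
		if ( !$this->lagByReplica ) {
			return null;
		}
		$fresh = array_filter(
			$this->lagByReplica,
			fn ( float $lag ): bool => $lag <= $this->maxLag
		);
		if ( $fresh ) {
			return (string)array_key_first( $fresh );
		}
		// All replicas lagged: remember that, then just pick anything.
		$this->laggedReplicaMode = true;
		return (string)array_rand( $this->lagByReplica );
	}

	/** Plain getter: does not trigger replica selection or a connection. */
	public function getLaggedReplicaMode(): bool {
		return $this->laggedReplicaMode;
	}

	/** In this model, replication lag alone never produces a read-only reason. */
	public function getReadOnlyReason(): string|false {
		return false;
	}

	/** Cap a cache TTL while in lagged mode so that stale output expires soon. */
	public function adaptTtl( int $ttl, int $laggedTtl = 30 ): int {
		return $this->laggedReplicaMode ? min( $ttl, $laggedTtl ) : $ttl;
	}
}

// Usage: every replica exceeds the threshold, so reads still get a server,
// writes stay allowed, and cache lifetimes are shortened.
$sketch = new LagAwarePickerSketch( [ 'db1' => 12.0, 'db2' => 9.5 ] );
$replica = $sketch->pickReplica();
$ttl = $sketch->adaptTtl( 3600 );
```

Keeping the flag as state that is set only when a replica is actually chosen means a request that never reads from a replica is never reported as lagged, which matches the reasoning in the commit message.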