author | Manuel Alcaraz Zambrano <manuelalcarazzam@gmail.com> | 2019-12-21 19:30:58 +0100 |
---|---|---|
committer | Manuel Alcaraz Zambrano <manuelalcarazzam@gmail.com> | 2019-12-21 19:30:58 +0100 |
commit | 328f78b711d2ed44aed5a0002f8aade8a6f7c94b (patch) | |
tree | 6900c6ac506831787da2d487424508bef340179c | |
parent | 615f53d8bd0902034502eef16c4ebb224307ba1e (diff) | |
docs: convert pageupdater and sitelist to markdown
Change-Id: I465831ad1fe0f05c32a44d2b47f3f19b78ba65d7
-rw-r--r-- | docs/pageupdater.md | 131 | ||||
-rw-r--r-- | docs/pageupdater.txt | 191 | ||||
-rw-r--r-- | docs/sitelist.md | 44 | ||||
-rw-r--r-- | docs/sitelist.txt | 47 |
4 files changed, 175 insertions, 238 deletions
diff --git a/docs/pageupdater.md b/docs/pageupdater.md new file mode 100644 index 000000000000..697d3565b2c1 --- /dev/null +++ b/docs/pageupdater.md @@ -0,0 +1,131 @@ +PageUpdater +=========== + +This document provides an overview of the usage of PageUpdater and DerivedPageDataUpdater. + +## `PageUpdater` +`PageUpdater` is the canonical way to create page revisions, that is, to perform edits. + +`PageUpdater` is a stateful, handle-like object that allows new revisions to be created on a given wiki page using the `saveRevision()` method. `PageUpdater` provides setters for defining the new revision's content as well as meta-data such as change tags. `saveRevision()` stores the new revision's primary content and metadata, and triggers the necessary updates to derived secondary data and cached artifacts e.g. in the `ParserCache` and the CDN layer, using a `DerivedPageDataUpdater`. + +`PageUpdater` instances follow the below life cycle, defined by a number of methods: + + +----------------------------+ + | | + | new | + | | + +------|--------------|------+ + | | + grabParentRevision()-| | + or hasEditConflict()-| | + | | + +--------v-------+ | + | | | + | parent known | | + | | | + Enables---------------+--------|-------+ | + safe operations based on | |-saveRevision() + the parent revision, e.g. | | + section replacement or | | + edit conflict resolution. | | + | | + saveRevision()-| | + | | + +------v--------------v------+ + | | + | creation committed | + | | + Enables-----------------+----------------------------+ + wasSuccess() + isUnchanged() + isNew() + getState() + getNewRevision() + etc. 
+ +The stateful nature of `PageUpdater` allows it to be used to safely perform transformations that depend on the new revision's parent revision, such as replacing sections or applying 3-way conflict resolution, while protecting against race conditions using a compare-and-swap (CAS) mechanism: after the calling code has used the `grabParentRevision()` method to access the edit's logical parent, `PageUpdater` remembers that revision and ensures that it is still the page's current revision when performing the atomic database update for the revision's primary meta-data when `saveRevision()` is called. If another revision was created concurrently, `saveRevision()` will fail, indicating the problem with the "edit-conflict" code in the status object. + +Typical usage for programmatic revision creation (with `$page` being a WikiPage as of 1.32, to be replaced by a repository service later): + + $updater = $page->newPageUpdater( $user ); + $updater->setContent( SlotRecord::MAIN, $content ); + $updater->setRcPatrolStatus( RecentChange::PRC_PATROLLED ); + $newRev = $updater->saveRevision( $comment ); + +Usage with content depending on the parent revision: + + $updater = $page->newPageUpdater( $user ); + $parent = $updater->grabParentRevision(); + $content = $parent->getContent( SlotRecord::MAIN )->replaceSection( $section, $sectionContent ); + $updater->setContent( SlotRecord::MAIN, $content ); + $newRev = $updater->saveRevision( $comment, EDIT_UPDATE ); + +In both cases, all secondary updates will be triggered automatically. + +## `DerivedPageDataUpdater` +`DerivedPageDataUpdater` is a stateful, handle-like object that caches derived data representing a revision, and can trigger updates of cached copies of that data, e.g. in the links tables, `page_props`, the `ParserCache`, and the CDN layer. 
+ +`DerivedPageDataUpdater` is used by `PageUpdater` when creating new revisions, but can also be used independently when performing metadata updates during undeletion, import, or when purging a page. It's a stepping stone on the way to a more complete refactoring of WikiPage. + +**NOTE**: Avoid direct usage of `DerivedPageDataUpdater`. In the future, we want to define interfaces for the different use cases of `DerivedPageDataUpdater`, particularly providing access to post-PST content and `ParserOutput` to callbacks during revision creation, which currently use `WikiPage::prepareContentForEdit()`, and allowing updates to be triggered on purge, import, and undeletion, which currently use `WikiPage::doEditUpdates()` and `Content::getSecondaryDataUpdates()`. + +The primary reason for `DerivedPageDataUpdater` to be stateful is internal caching of state that avoids the re-generation of `ParserOutput` and re-application of pre-save-transformations (PST). + +`DerivedPageDataUpdater` instances follow the below life cycle, defined by a number of methods: + + +---------------------------------------------------------------------+ + | | + | new | + | | + +---------------|------------------|------------------|---------------+ + | | | + grabCurrentRevision()-| | | + | | | + +-----------v----------+ | | + | | |-prepareContent() | + | knows current | | | + | | | | + Enables------------------+-----|-----|----------+ | | + pageExisted() | | | | + wasRedirect() | |-prepareContent() | |-prepareUpdate() + | | | | + | | +-------------v------------+ | + | | | | | + | +----> has content | | + | | | | + Enables------------------------|----------+--------------------------+ | + isChange() | | | + isCreation() |-prepareUpdate() | | + getSlots() | prepareUpdate()-| | + getTouchedSlotRoles() | | | + getCanonicalParserOutput() | +-----------v------------v-----------------+ + | | | + +------------------> has revision | + | | + 
Enables-------------------------------------------+------------------------|-----------------+ + updateParserCache() | + runSecondaryDataUpdates() |-doUpdates() + | + +-----------v---------+ + | | + | updates done | + | | + +---------------------+ + + +- `grabCurrentRevision()` returns the logical parent revision of the target revision. It is guaranteed to always return the same revision for a given `DerivedPageDataUpdater` instance. If called before `prepareUpdate()`, this fixates the logical parent to be the page's current revision. If called for the first time after `prepareUpdate()`, it returns the revision passed as the 'oldrevision' option to `prepareUpdate()`, or, if that wasn't given, the parent of $revision parameter passed to `prepareUpdate()`. + +- `prepareContent()` is called before the new revision is created, to apply pre-save-transformation (PST) and allow subsequent access to the canonical `ParserOutput` of the revision. `getSlots()` and `getCanonicalParserOutput()` as well as `getSecondaryDataUpdates()` may be used after `prepareContent()` was called. Calling `prepareContent()` with the same parameters again has no effect. Calling it again with mismatching parameters, or calling it after `prepareUpdate()` was called, triggers a `LogicException`. + +- `prepareUpdate()` is called after the new revision has been created. This may happen right after the revision was created, on the same instance on which `prepareContent()` was called, or later (possibly much later), on a fresh instance in a different process, due to deferred or asynchronous updates, or during import, undeletion, purging, etc. `prepareUpdate()` is required before a call to `doUpdates()`, and it also enables calls to `getSlots()` and `getCanonicalParserOutput()` as well as `getSecondaryDataUpdates()`. Calling `prepareUpdate()` with the same parameters again has no effect. 
Calling it again with mismatching parameters, or calling it with parameters mismatching the ones `prepareContent()` was called with, triggers a `LogicException`. + +- `getSecondaryDataUpdates()` returns `DataUpdates` that represent derived data for the revision. These may be used to update such data, e.g. in `ApiPurge`, `RefreshLinksJob`, and the `refreshLinks` script. + +- `doUpdates()` triggers the updates defined by `getSecondaryDataUpdates()`, and also causes updates to cached artifacts in the `ParserCache`, the CDN layer, etc. This is primarily used by PageUpdater, but also by `PageArchive` during undeletion, and when importing revisions from XML. `doUpdates()` can only be called after `prepareUpdate()` was used to initialize the `DerivedPageDataUpdater` instance for a specific revision. Calling it before `prepareUpdate()` is called raises a `LogicException`. + +A `DerivedPageDataUpdater` instance is intended to be re-used during different stages of complex update operations that often involve callbacks to extension code via +MediaWiki's hook mechanism, or deferred or even asynchronous execution of Jobs and `DeferredUpdates`. Since these mechanisms typically do not provide a way to pass a +`DerivedPageDataUpdater` directly, `WikiPage::getDerivedPageDataUpdater()` has to be used to obtain a `DerivedPageDataUpdater` for the update currently in progress - re-using the same `DerivedPageDataUpdater` if possible avoids re-generation of `ParserOutput` objects +and other expensively derived artifacts. + +This mechanism for re-using a `DerivedPageDataUpdater` instance without passing it directly requires a way to ensure that a given `DerivedPageDataUpdater` instance can actually be used in the calling code's context. For this purpose, `WikiPage::getDerivedPageDataUpdater()` calls the `isReusableFor()` method on `DerivedPageDataUpdater`, which ensures that the given instance is applicable to the given parameters. 
In other words, `isReusableFor()` predicts whether calling `prepareContent()` or `prepareUpdate()` with a given set of parameters will trigger a `LogicException`. In that case, `WikiPage::getDerivedPageDataUpdater()` creates a fresh `DerivedPageDataUpdater` instance. diff --git a/docs/pageupdater.txt b/docs/pageupdater.txt deleted file mode 100644 index 3d113f65690a..000000000000 --- a/docs/pageupdater.txt +++ /dev/null @@ -1,191 +0,0 @@ -This document provides an overview of the usage of PageUpdater and DerivedPageDataUpdater. - -== PageUpdater == -PageUpdater is the canonical way to create page revisions, that is, to perform edits. - -PageUpdater is a stateful, handle-like object that allows new revisions to be created -on a given wiki page using the saveRevision() method. PageUpdater provides setters for -defining the new revision's content as well as meta-data such as change tags. saveRevision() -stores the new revision's primary content and metadata, and triggers the necessary -updates to derived secondary data and cached artifacts e.g. in the ParserCache and the -CDN layer, using a DerivedPageDataUpdater. - -PageUpdater instances follow the below life cycle, defined by a number of -methods: - - +----------------------------+ - | | - | new | - | | - +------|--------------|------+ - | | - grabParentRevision()-| | - or hasEditConflict()-| | - | | - +--------v-------+ | - | | | - | parent known | | - | | | - Enables---------------+--------|-------+ | - safe operations based on | |-saveRevision() - the parent revision, e.g. | | - section replacement or | | - edit conflict resolution. | | - | | - saveRevision()-| | - | | - +------v--------------v------+ - | | - | creation committed | - | | - Enables-----------------+----------------------------+ - wasSuccess() - isUnchanged() - isNew() - getState() - getNewRevision() - etc. 
- -The stateful nature of PageUpdater allows it to be used to safely perform -transformations that depend on the new revision's parent revision, such as replacing -sections or applying 3-way conflict resolution, while protecting against race -conditions using a compare-and-swap (CAS) mechanism: after the calling code has used the -grabParentRevision() method to access the edit's logical parent, PageUpdater -remembers that revision and ensures that it is still the page's current -revision when performing the atomic database update for the revision's primary -meta-data when saveRevision() is called. If another revision was created concurrently, -saveRevision() will fail, indicating the problem with the "edit-conflict" code in the status -object. - -Typical usage for programmatic revision creation (with $page being a WikiPage as of 1.32, to be -replaced by a repository service later): - - $updater = $page->newPageUpdater( $user ); - $updater->setContent( SlotRecord::MAIN, $content ); - $updater->setRcPatrolStatus( RecentChange::PRC_PATROLLED ); - $newRev = $updater->saveRevision( $comment ); - -Usage with content depending on the parent revision: - - $updater = $page->newPageUpdater( $user ); - $parent = $updater->grabParentRevision(); - $content = $parent->getContent( SlotRecord::MAIN )->replaceSection( $section, $sectionContent ); - $updater->setContent( SlotRecord::MAIN, $content ); - $newRev = $updater->saveRevision( $comment, EDIT_UPDATE ); - -In both cases, all secondary updates will be triggered automatically. - -== DerivedPageDataUpdater == -DerivedPageDataUpdater is a stateful, handle-like object that caches derived data representing -a revision, and can trigger updates of cached copies of that data, e.g. in the links tables, -page_props, the ParserCache, and the CDN layer. 
- -DerivedPageDataUpdater is used by PageUpdater when creating new revisions, but can also -be used independently when performing metadata updates during undeletion, import, or -when purging a page. It's a stepping stone on the way to a more complete refactoring of WikiPage. - -NOTE: Avoid direct usage of DerivedPageDataUpdater. In the future, we want to define interfaces -for the different use cases of DerivedPageDataUpdater, particularly providing access to post-PST -content and ParserOutput to callbacks during revision creation, which currently use -WikiPage::prepareContentForEdit(), and allowing updates to be triggered on purge, import, and -undeletion, which currently use WikiPage::doEditUpdates() and Content::getSecondaryDataUpdates(). - -The primary reason for DerivedPageDataUpdater to be stateful is internal caching of state -that avoids the re-generation of ParserOutput and re-application of pre-save- -transformations (PST). - -DerivedPageDataUpdater instances follow the below life cycle, defined by a number of -methods: - - +---------------------------------------------------------------------+ - | | - | new | - | | - +---------------|------------------|------------------|---------------+ - | | | - grabCurrentRevision()-| | | - | | | - +-----------v----------+ | | - | | |-prepareContent() | - | knows current | | | - | | | | - Enables------------------+-----|-----|----------+ | | - pageExisted() | | | | - wasRedirect() | |-prepareContent() | |-prepareUpdate() - | | | | - | | +-------------v------------+ | - | | | | | - | +----> has content | | - | | | | - Enables------------------------|----------+--------------------------+ | - isChange() | | | - isCreation() |-prepareUpdate() | | - getSlots() | prepareUpdate()-| | - getTouchedSlotRoles() | | | - getCanonicalParserOutput() | +-----------v------------v-----------------+ - | | | - +------------------> has revision | - | | - 
Enables-------------------------------------------+------------------------|-----------------+ - updateParserCache() | - runSecondaryDataUpdates() |-doUpdates() - | - +-----------v---------+ - | | - | updates done | - | | - +---------------------+ - - -- grabCurrentRevision() returns the logical parent revision of the target revision. It is -guaranteed to always return the same revision for a given DerivedPageDataUpdater instance. -If called before prepareUpdate(), this fixates the logical parent to be the page's current -revision. If called for the first time after prepareUpdate(), it returns the revision -passed as the 'oldrevision' option to prepareUpdate(), or, if that wasn't given, the -parent of $revision parameter passed to prepareUpdate(). - -- prepareContent() is called before the new revision is created, to apply pre-save- -transformation (PST) and allow subsequent access to the canonical ParserOutput of the -revision. getSlots() and getCanonicalParserOutput() as well as getSecondaryDataUpdates() -may be used after prepareContent() was called. Calling prepareContent() with the same -parameters again has no effect. Calling it again with mismatching parameters, or calling -it after prepareUpdate() was called, triggers a LogicException. - -- prepareUpdate() is called after the new revision has been created. This may happen -right after the revision was created, on the same instance on which prepareContent() was -called, or later (possibly much later), on a fresh instance in a different process, -due to deferred or asynchronous updates, or during import, undeletion, purging, etc. -prepareUpdate() is required before a call to doUpdates(), and it also enables calls to -getSlots() and getCanonicalParserOutput() as well as getSecondaryDataUpdates(). -Calling prepareUpdate() with the same parameters again has no effect. 
-Calling it again with mismatching parameters, or calling it with parameters mismatching -the ones prepareContent() was called with, triggers a LogicException. - -- getSecondaryDataUpdates() returns DataUpdates that represent derived data for the revision. -These may be used to update such data, e.g. in ApiPurge, RefreshLinksJob, and the refreshLinks -script. - -- doUpdates() triggers the updates defined by getSecondaryDataUpdates(), and also causes -updates to cached artifacts in the ParserCache, the CDN layer, etc. This is primarily -used by PageUpdater, but also by PageArchive during undeletion, and when importing -revisions from XML. doUpdates() can only be called after prepareUpdate() was used to -initialize the DerivedPageDataUpdater instance for a specific revision. Calling it before -prepareUpdate() is called raises a LogicException. - -A DerivedPageDataUpdater instance is intended to be re-used during different stages -of complex update operations that often involve callbacks to extension code via -MediaWiki's hook mechanism, or deferred or even asynchronous execution of Jobs and -DeferredUpdates. Since these mechanisms typically do not provide a way to pass a -DerivedPageDataUpdater directly, WikiPage::getDerivedPageDataUpdater() has to be used to -obtain a DerivedPageDataUpdater for the update currently in progress - re-using the -same DerivedPageDataUpdater if possible avoids re-generation of ParserOutput objects -and other expensively derived artifacts. - -This mechanism for re-using a DerivedPageDataUpdater instance without passing it directly -requires a way to ensure that a given DerivedPageDataUpdater instance can actually be used -in the calling code's context. For this purpose, WikiPage::getDerivedPageDataUpdater() -calls the isReusableFor() method on DerivedPageDataUpdater, which ensures that the given -instance is applicable to the given parameters. 
In other words, isReusableFor() predicts -whether calling prepareContent() or prepareUpdate() with a given set of parameters will -trigger a LogicException. In that case, WikiPage::getDerivedPageDataUpdater() creates a -fresh DerivedPageDataUpdater instance. diff --git a/docs/sitelist.md b/docs/sitelist.md new file mode 100644 index 000000000000..b7d3ad4417bb --- /dev/null +++ b/docs/sitelist.md @@ -0,0 +1,44 @@ +Sitelist +======== + +This document describes the XML format used to represent information about external sites known to a MediaWiki installation. This information about external sites is used to allow "inter-wiki" links, cross-language navigation, as well as close integration via direct access to the other site's web API or even directly to their database. + +Lists of external sites can be imported and exported using the *importSites.php* and *exportSites.php* scripts. In the database, external sites are described by the `sites` and `site_ids` tables. + +The formal specification of the format used by *importSites.php* and *exportSites.php* can be found in the *sitelist-1.0.xsd* file. Below is an example and a brief description of what the individual XML elements and attributes mean: + + + <sites version="1.0"> + <site> + <globalid>acme.com</globalid> + <localid type="interwiki">acme</localid> + <group>Vendor</group> + <path type="link">http://acme.com/</path> + <source>meta.wikimedia.org</source> + </site> + <site type="mediawiki"> + <globalid>de.wikidik.example</globalid> + <localid type="equivalent">de</localid> + <group>Dictionary</group> + <forward/> + <path type="page_path">http://acme.com/</path> + </site> + </sites> + + +The XML elements are used as follows: + +- `sites`: The root element, containing a set of site tags. May have a `version` attribute with the value `1.0`. +- `site`: A site entry, representing an external website. 
May have a `type` attribute with one of the following values: + + `unknown`: (default) any website + + `mediawiki`: A MediaWiki site +- `globalid`: A unique identifier for the site. For a given site, the same unique global ID must be used across all wikis in a wiki farm (aka wiki family). +- `localid`: An identifier for the site, for use on the local wiki. Multiple local IDs may be assigned to a given site. The same local ID can be used to refer to different sites by different wikis on the same farm/family. The `localid` element may have a type attribute with one of the following values: + + `interwiki`: Used as an "interwiki" link prefix, for creating cross-wiki links. + + `equivalent`: Used as a "language" link prefix, for cross-linking equivalent content in different languages. +- `group`: The site group (e.g. wiki family) the site belongs to. +- `path`: A URL template for accessing resources on the site. Several paths may be defined for a given site, for accessing different kinds of resources, identified by the `type` attribute, using one of the following values: + + `link`: Generic URL template, often the document root. + + `page_path`: (for `mediawiki` sites) URL template for wiki pages (corresponds to the target wiki's `$wgArticlePath` setting) + + `file_path`: (for `mediawiki` sites) URL pattern for application entry points and resources (corresponds to the target wiki's `$wgScriptPath` setting). +- `forward`: Whether using a prefix defined by a `localid` tag in the URL will cause the request to be redirected to the corresponding page on the target wiki (currently unused). E.g. whether <http://wiki.acme.com/wiki/foo:Buzz> should be forwarded to <http://wiki.foo.com/read/Buzz>. 
(CAVEAT: not yet implemented; can be specified but has no effect) diff --git a/docs/sitelist.txt b/docs/sitelist.txt deleted file mode 100644 index 24e1b9a7f58e..000000000000 --- a/docs/sitelist.txt +++ /dev/null @@ -1,47 +0,0 @@ -This document describes the XML format used to represent information about external sites known -to a MediaWiki installation. This information about external sites is used to allow "inter-wiki" -links, cross-language navigation, as well as close integration via direct access to the other -site's web API or even directly to their database. - -Lists of external sites can be imported and exported using the importSites.php and exportSites.php -scripts. In the database, external sites are described by the sites and site_ids tables. - -The formal specification of the format used by importSites.php and exportSites.php can be found in -the sitelist-1.0.xsd file. Below is an example and a brief description of what the individual XML -elements and attributes mean: - - - <sites version="1.0"> - <site> - <globalid>acme.com</globalid> - <localid type="interwiki">acme</localid> - <group>Vendor</group> - <path type="link">http://acme.com/</path> - <source>meta.wikimedia.org</source> - </site> - <site type="mediawiki"> - <globalid>de.wikidik.example</globalid> - <localid type="equivalent">de</localid> - <group>Dictionary</group> - <forward/> - <path type="page_path">http://acme.com/</path> - </site> - </sites> - - -The XML elements are used as follows: - -* sites: The root element, containing a set of site tags. May have a version attribute with the value 1.0. -* site: A site entry, representing an external website. May have a type attribute with one of the following values: -** ''unknown'': (default) any website -** ''mediawiki'': A MediaWiki site -* globalid: A unique identifier for the site. For a given site, the same unique global ID must be used across all wikis in a wiki farm (aka wiki family). 
-* localid: An identifier for the site, for use on the local wiki. Multiple local IDs may be assigned to a given site. The same local ID can be used to refer to different sites by different wikis on the same farm/family. The localid element may have a type attribute with one of the following values: -** interwiki: Used as an "interwiki" link prefix, for creating cross-wiki links. -** equivalent: Used as a "language" link prefix, for cross-linking equivalent content in different languages. -* group: The site group (e.g. wiki family) the site belongs to. -* path: A URL template for accessing resources on the site. Several paths may be defined for a given site, for accessing different kinds of resources, identified by the type attribute, using one of the following values: -** link: Generic URL template, often the document root. -** page_path: (for mediawiki sites) URL template for wiki pages (corresponds to the target wiki's $wgArticlePath setting) -** file_path: (for mediawiki sites) URL pattern for application entry points and resources (corresponds to the target wiki's $wgScriptPath setting). -* forward: Whether using a prefix defined by a localid tag in the URL will cause the request to be redirected to the corresponding page on the target wiki (currently unused). E.g. whether http://wiki.acme.com/wiki/foo:Buzz should be forwarded to http://wiki.foo.com/read/Buzz. (CAVEAT: not yet implemented; can be specified but has no effect) |
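
As a reading aid for the sitelist format described in docs/sitelist.md, here is a minimal Python sketch that loads the sample document from the diff and collects the fields listed in the element descriptions. It is not part of this commit or of MediaWiki: the `parse_sitelist` helper and the dictionary keys are hypothetical names chosen for illustration. It also assumes the elements carry no XML namespace, as in the sample; a real export may declare one, in which case the lookups would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# The sample document shown in docs/sitelist.md.
SAMPLE = """<sites version="1.0">
  <site>
    <globalid>acme.com</globalid>
    <localid type="interwiki">acme</localid>
    <group>Vendor</group>
    <path type="link">http://acme.com/</path>
    <source>meta.wikimedia.org</source>
  </site>
  <site type="mediawiki">
    <globalid>de.wikidik.example</globalid>
    <localid type="equivalent">de</localid>
    <group>Dictionary</group>
    <forward/>
    <path type="page_path">http://acme.com/</path>
  </site>
</sites>"""


def parse_sitelist(xml_text):
    """Return one dict per <site> element (hypothetical helper, not a MediaWiki API)."""
    root = ET.fromstring(xml_text)
    sites = []
    for site in root.findall("site"):
        sites.append({
            # The type attribute is optional; the document names "unknown" as the default.
            "type": site.get("type", "unknown"),
            "globalid": site.findtext("globalid"),
            "group": site.findtext("group"),
            # A site may carry several local IDs, each with its own type.
            "localids": [(e.get("type"), e.text) for e in site.findall("localid")],
            # Several paths may be defined, distinguished by their type attribute.
            "paths": {e.get("type"): e.text for e in site.findall("path")},
            # <forward/> is an empty flag element, so only its presence is checked.
            "forward": site.find("forward") is not None,
        })
    return sites


if __name__ == "__main__":
    for entry in parse_sitelist(SAMPLE):
        print(entry["globalid"], entry["type"], entry["paths"])
```

The presence check for `forward` uses `is not None` because an ElementTree element without children evaluates as false, so a plain truthiness test would miss the flag.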