path: root/maintenance/refreshLinks.php
Commit message (Author, Age, Files, Lines)
* maintenance: Use more of namespaced Maintenance class (Reedy, 2024-10-16, 1 file, -0/+1)
| Change-Id: I53f2e32c73c92cc3a0deee48ebe6d13329a7a0cf
* Specify caller in DB queries (Bartosz Dziewoński, 2024-09-11, 1 file, -1/+1)
| Found warnings about this in WMF production logs.
|
| Change-Id: I3ba973c320d672604c0c0ffa1c229a32231261b9
* Exclude boilerplate maintenance code from code coverage reports (Dreamy Jazz, 2024-08-27, 1 file, -0/+4)
| Why:
| * Maintenance scripts in core have boilerplate code that is added before and after the class to allow directly running the maintenance script.
| * Running the maintenance script directly has been deprecated since 1.40, so this boilerplate code only supports a now-deprecated method of running maintenance scripts.
| * This code also cannot be marked as covered, because PHPUnit does not recognise code coverage for files.
| * Therefore, it is best to ignore this boilerplate code in code coverage reports, as it cannot be marked as covered and only serves deprecated usage.
|
| What:
| * Wrap the boilerplate code (requiring Maintenance.php and then later defining the maintenance script class and running if the maintenance script was called directly) with @codeCoverageIgnore comments.
| * Some files use different boilerplate code; these should also be marked as ignored for coverage, for the same reason that coverage is not properly reported for files.
|
| Bug: T371167
| Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
* maintenance: Use expression builder instead of raw sql (Umherirrender, 2024-07-22, 1 file, -1/+2)
| Bug: T361023
| Change-Id: Ieb229d8088cb1ff3f03e44f7ac99eb612f48bc7b
* Use expression builder to avoid raw sql via BETWEEN operator (Umherirrender, 2024-04-21, 1 file, -6/+7)
| Replace BETWEEN with >= and <= operator
|
| Change-Id: Ic21b6f4cc11c773c967d9d4c5f20e762c2ff9629
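The replacement described in this commit relies on `x BETWEEN a AND b` being inclusive on both ends, so it selects the same rows as `x >= a AND x <= b`. A minimal sketch of that equivalence follows. It uses Python's sqlite3 module as a stand-in and is not MediaWiki's expression builder; the helper names and the demo table contents are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with page_id values 1..10 (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO page (page_id) VALUES (?)", [(i,) for i in range(1, 11)]
    )
    return conn


def ids_between(conn, start, end):
    """Select IDs with BETWEEN, which is inclusive on both ends."""
    rows = conn.execute(
        "SELECT page_id FROM page WHERE page_id BETWEEN ? AND ? ORDER BY page_id",
        (start, end),
    )
    return [row[0] for row in rows]


def ids_ge_le(conn, start, end):
    """Select the same range with two explicit comparisons."""
    rows = conn.execute(
        "SELECT page_id FROM page WHERE page_id >= ? AND page_id <= ? ORDER BY page_id",
        (start, end),
    )
    return [row[0] for row in rows]
```

Splitting the range into two comparisons lets each bound be built as its own condition, which is what makes it possible to drop the raw SQL fragment.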
* maintenance: Migrate to IDatabase::newUpdateQueryBuilder (Umherirrender, 2024-04-14, 1 file, -2/+6)
| Bug: T353219
| Change-Id: Ic278c8534dad40a3f34674db2d5fbfbca5984da8
* maintenance: Introduce getReplicaDB() and getPrimaryDB() (Amir Sarabadani, 2024-01-18, 1 file, -1/+1)
| And start using them instead of wfGetDB(), LB/LBF connection methods or, worse, $this->getDB(). $this->getDB() reuses the database object regardless of whether you ask for a replica or the primary, so it can return a replica when the primary was requested and the other way around.
|
| Bug: T330641
| Change-Id: I9e2cf85ca277022284fc26b9f37db57bd12aaa81
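The pitfall described in this commit can be illustrated with a toy model. The sketch below is not MediaWiki's Maintenance class: the class names, method names and connection strings are all hypothetical. It only shows why a single cached handle that ignores the requested role is risky, and why separate accessors avoid the problem.

```python
DB_REPLICA = "replica"
DB_PRIMARY = "primary"


class CachedHandleScript:
    """Toy script that caches one handle and ignores the role on later calls."""

    def __init__(self):
        self._db = None

    def get_db(self, role):
        # The first caller decides which connection everyone else receives.
        if self._db is None:
            self._db = "connection:" + role
        return self._db


class ExplicitHandleScript:
    """Toy script with one accessor per role, each with its own cached handle."""

    def __init__(self):
        self._handles = {}

    def get_replica_db(self):
        return self._handles.setdefault(DB_REPLICA, "connection:" + DB_REPLICA)

    def get_primary_db(self):
        return self._handles.setdefault(DB_PRIMARY, "connection:" + DB_PRIMARY)
```

In the first class, asking for the primary after someone asked for a replica silently returns the replica handle. In the second class, the role is part of the method name, so the mix-up cannot happen.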
* maintenance: Migrate to DeleteQueryBuilder (Amir Sarabadani, 2024-01-02, 1 file, -2/+8)
| Bug: T353219
| Change-Id: Iecb55ab3f905ee9ed4e32e9cbb58c36f8cacf669
* Use thousands separators in selected integer literals (Tim Starling, 2023-12-12, 1 file, -3/+3)
| For readability. Allowed since PHP 7.4.
|
| I searched for integer literals of 6 or more digits, and also changed some nearby smaller numbers for consistency.
|
| Bug: T353205
| Change-Id: I8518e04889ba8fd52e0f9476a74f8e3e1454b678
* Namespace remaining files under includes/deferred (James D. Forrester, 2023-11-22, 1 file, -0/+1)
| Bug: T166010
| Change-Id: Ibd40734b96fd2900e3ce12239d09becfb4150059
* Replace more single-value $db->buildComparison() with $db->expr() (Bartosz Dziewoński, 2023-10-22, 1 file, -4/+4)
| A few more fairly simple cases that don't quite match the regexp in I2cfc3070c2a08fc3888ad48a995f7d79198cc336 or required other tweaks.
|
| Change-Id: I5438c777344e9ba07f3b62a452fce9ec63baa48a
* Re-apply "Remove allowances for missing `redirect` rows" (Bartosz Dziewoński, 2023-10-18, 1 file, -11/+2)
| This reverts commit c5f4ffd4e6304ab0e99c74a42d2167520b0c25e0, re-applies commit b0fe2c41111b0d7ef6535ed5674a7add2d786af8.
|
| WikiPage::getRedirectTarget() needs to still allow missing rows, but for a different reason.
|
| Bug: T348881
| Change-Id: I6e1fd823fbe140819c28096d5adc41cd15bcc8c0
* Revert "Remove allowances for missing `redirect` rows" (Bartosz Dziewoński, 2023-10-13, 1 file, -2/+11)
| This reverts commit b0fe2c41111b0d7ef6535ed5674a7add2d786af8.
|
| Reason for revert: Causing test failures in the UserMerge extension.
|
| Bug: T348881
| Change-Id: I35e82df7a7f95150927dc6e4ad68588c3400b63f
* Remove allowances for missing `redirect` rows (Bartosz Dziewoński, 2023-10-03, 1 file, -11/+2)
| After the other changes in T346290 there must always be a `redirect` table row for each page with `page_is_redirect=1`.
|
| The only place that needs to handle missing rows is the migration script fixInconsistentRedirects.php.
|
| Bug: T346290
| Change-Id: I7e991aa5a33be37e0d6c9ef0900306706c171466
* installer: Add database updater for 2008/2011 redirect schema changes (Bartosz Dziewoński, 2023-09-21, 1 file, -4/+6)
| In 2008, the `redirect` table was added, and in 2011, it gained the fields `rd_interwiki` and `rd_fragment`. We have never performed proper maintenance for those changes, instead relying on code in WikiPage to update it when the page was visited, or on an optional run of refreshLinks.php.
|
| I would like to remove the code in WikiPage, so we probably need to perform this maintenance in the database updater. You know, for the millions of people who have been dutifully upgrading their MediaWiki installations since 2008, but never visited the pages there.
|
| The script is a trimmed-down version of refreshLinks.php, without all the weird stuff, and using a better index for the queries.
|
| Bug: T346290
| Change-Id: Iea251d2737b2fb472c4efb060ad2b97735b4ac53
* maintenance: Begin using `Maintenance::getServiceContainer()` (Derick Alangi, 2023-09-04, 1 file, -2/+2)
| The Maintenance class provides a method for getting a fresh reference to the MW services container instance. Let's make use of it in maintenance scripts now that we have it.
|
| NOTE: There are still some static methods, like in refreshLinks.php, that make use of services and for which we can't use this method for now.
|
| Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
* Fix various typos and documentation issues (Matěj Suchánek, 2023-08-27, 1 file, -1/+1)
| Change-Id: I2cd4b647c01d84cfe0e1b4d55e155ced8c918b17
* refreshLinks: Use join instead of subquery for dfnCheckInterval() (Func, 2023-08-25, 1 file, -10/+10)
| From now on, this script is fully migrated to the select query builder method.
|
| Change-Id: I61623632d9f61fadf58bbda62fcd3be38690b641
* refreshLinks: Fix refreshing pages in category (Func, 2023-08-24, 1 file, -1/+3)
| Follow up to commit 49e56ae11, the `page_id` field should always be selected since we use it later.
|
| Sorry, I only noticed this issue when I ran it with the `--verbose` option.
|
| Bug: T344402
| Change-Id: Ia8a3affea3324955a94ba5b2cd7a9fb39596cc44
* refreshLinks: Introduce `--touched-only` option (Func, 2023-08-22, 1 file, -0/+7)
| So that we can fix only pages that have been touched after the last update.
|
| Bug: T344402
| Change-Id: I141e8c9c36801373f89141155ed5124ca2234388
* refreshLinks: Skip DFN if the namespace option is given (Func, 2023-08-22, 1 file, -20/+8)
| This feature cannot support querying by namespace: only a few link tables have the `xxx_from_namespace` field, and we are looking for non-existing pages.
|
| Bug: T344402
| Change-Id: I21485e2ce843489072a0d6dbeec621ceec9fe6ae
* refreshLinks: Improve efficiency of page filtering (Func, 2023-08-22, 1 file, -168/+82)
| Use select queries for page IDs, so we can avoid loading and checking each page via the WikiPage object.
|
| Also, reused the doRefreshLinks() method for refreshCategory(), so the fixRedirect() check also works for this case.
|
| Other behavioral changes:
| - Pass the start/end options to deleteLinksFromNonexistent() even when not in the `dfn-only` mode, since we may want to run different intervals in parallel to save time, and we don't need DFN without intervals.
| - fixRedirect() now won't delete entries for nonexistent pages, since the page filtering method changed to use select queries, and the deleting is covered by deleteLinksFromNonexistent().
| - Removed the clearing of link cache, which was added to control the memory usage in 2006, but now LinkCache uses a MapCacheLRU.
|
| Bug: T344402
| Change-Id: Iaefeeb0391393a2273edfa0f32d4f75ff4b7b22b
* refreshLinks: Remove unrelated check on the `tracking-category` option (Func, 2023-08-17, 1 file, -5/+1)
| Tracking category keys are something like `broken-file-category`, not a category page name for Title::makeTitleSafe().
|
| This partially reverts commit 06e2d0e874408b24a384cfba6d7f96a88197f66e.
|
| Bug: T331473
| Change-Id: Ic744a58ef56981c3aecc4e7cf5322b77894a9249
* Simplify WHERE conditions with field IS NULL (Umherirrender, 2023-07-24, 1 file, -2/+2)
| Reduce raw sql fragments on simple compares
|
| Change-Id: I3f2340dfdbf5197cc22546911e6c5653dc5a6269
* maintenance: Switch simple calls of Database::select to SQB (Amir Sarabadani, 2023-07-19, 1 file, -36/+36)
| Done semi-automatically via a php parser written on top of ANTLR4.
|
| Bug: T311866
| Change-Id: I33f5b6703c0aa9c80c907a21c2a770e30642edd3
* refreshLinks: set a causeAction for SecondaryDataUpdates (David Causse, 2023-06-12, 1 file, -0/+1)
| Knowing the reason why a secondary update was triggered is useful information for debugging. Some LinksUpdate hook consumers might also want to fine-tune their behaviors based on this value.
|
| Change-Id: I19c0620e409b31995080ee0111b0b78782276563
* refreshLinks: Add verbose option (samtar, 2023-03-07, 1 file, -0/+13)
| Add verbose output to refreshCategory() with total pages to refresh, and per-page refresh status.
|
| Apply `Title::makeTitleSafe` to the passed `tracking-category` before passing it to refreshTrackingCategory().
|
| Bug: T331473
| Change-Id: I3234b1560156813b95355754de2212508f7ee6af
* TrackingCategories: Change doc from Title to LinkTarget (Umherirrender, 2023-03-02, 1 file, -3/+4)
| Repair maintenance script (from cba68e4) while testing
|
| Follow-Up: I697ce188a912e445a6a748121575548e79aabac6
| Change-Id: Id0cc2cafbe5780f11855d0cf608296f2b331e1ee
* Reorg: Namespace the Title class (James D. Forrester, 2023-03-02, 1 file, -0/+1)
| This is moderately messy.
|
| Process was principally:
|
| * xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
|   xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
| * rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
|   xargs rg --files-with-matches 'Title\b' | \
|   xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
| * composer fix
|
| Then manual fix-ups for a few files that don't have any use statements.
|
| Bug: T166010
| Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
| Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
* Use more narrow database interfaces in maintenance scripts (thiemowmde, 2023-02-27, 1 file, -3/+3)
| This makes this code easier to read and to maintain because it's more obvious why a DB connection is passed. For now this patch focuses exclusively on private methods.
|
| Change-Id: Id60dc90b124f4cae1dfbede990f45e3c69491a25
* Merge "Add --before-timestamp option to refreshLinks.php" (jenkins-bot, 2023-02-21, 1 file, -5/+21)
|\
| * Add --before-timestamp option to refreshLinks.php (Kunal Mehta, 2023-02-19, 1 file, -5/+21)
| | We want to be able to refresh pages that haven't been updated in quite a while to ensure any MediaWiki parsing changes, etc. get reflected in links tables. This information is already tracked in the page.page_links_updated field, which we'll filter with.
| |
| | We can't actually do it in SQL because there's no index on the column, but it gets loaded by WikiPageFactory::newFromID(), so we simply check it in the fixRedirect() and fixLinksFromArticle() functions.
| |
| | == Test plan ==
| | * Run plain `refreshLinks.php` and see that all page_links_updated fields have been updated to now (this patch doesn't break existing functionality).
| | * Backdate some page_links_updated fields.
| | * Run refreshLinks.php --before-timestamp X, where X is between your backdated values and now.
| | * Observe that only the backdated pages have had their page_links_updated modified to now.
| |
| | Bug: T159512
| | Change-Id: I695d971ef7cbabddda3125361975be0f94dabf4c
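The in-code check described in this commit can be sketched as a small predicate. This is an illustration in Python, not the script's PHP code. It assumes timestamps are 14-digit `YYYYMMDDHHMMSS` strings, which compare correctly as plain strings, and it assumes that a page with no recorded `page_links_updated` value counts as stale. The function name `needs_refresh` is hypothetical.

```python
def needs_refresh(page_links_updated, before_timestamp):
    """Decide whether a page's links should be refreshed.

    page_links_updated: 14-digit timestamp string, or None if never recorded
                        (treating None as stale is an assumption of this sketch).
    before_timestamp:   14-digit cutoff string, or None when no cutoff was given.
    """
    if before_timestamp is None:
        # No cutoff was given, so every page is refreshed as before.
        return True
    if page_links_updated is None:
        return True
    # Fixed-width digit strings sort in chronological order.
    return page_links_updated < before_timestamp
```

Following the test plan in the commit message, a page backdated to `20200101000000` is refreshed with a cutoff of `20230101000000`, while a page updated on `20230219000000` is left alone.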
* | refreshLinks: Use namespaceCond() (Kunal Mehta, 2023-02-18, 1 file, -4/+1)
|/
| It does the same thing.
|
| Change-Id: I9c3554487e0207ef6df95cbde309d36cc610aa05
* ParserCache: Improve docs for cache type and purgeParserCache.php (Timo Tijhof, 2023-01-30, 1 file, -4/+1)
| Change-Id: I3301ff90a135bd0103c1dfc7e86f1dd1ba245a5a
* Merge "Use buildComparison() instead of raw SQL in more maintenance scripts" (jenkins-bot, 2022-12-01, 1 file, -3/+4)
|\
| * Use buildComparison() instead of raw SQL in more maintenance scripts (Bartosz Dziewoński, 2022-11-15, 1 file, -3/+4)
| | Bug: T321422
| | Change-Id: Ibe46e5df64a3a6a6e8042a56e10aa286dd3797dd
* | Various doc fixes about false on method arguments/return types (Umherirrender, 2022-11-10, 1 file, -1/+1)
|/
| Doc-only changes
|
| Change-Id: I5177f582ae7ee70c357e9389fed14819faf79463
* maintenance: Use $this->waitForReplication() (Amir Sarabadani, 2022-10-24, 1 file, -10/+7)
| This adds reconfiguring of db pools in case a replica gets depooled.
|
| Bug: T298485
| Change-Id: Id052ce8ed45c51e51b071778858d27b48605bf93
* Replace trivial usages of code in strings with concatenation (Thiemo Kreuz, 2022-08-26, 1 file, -3/+3)
| This is really hard to read. What is code, what is string? These places are so simple, they really don't need the "{$var}" syntax.
|
| Change-Id: I589dedb8c0193eec4eef500bbb896b5b790b727b
* Introduce `Redirect(Lookup&Store)` services to handle redirects (Derick Alangi, 2021-12-01, 1 file, -1/+1)
| The concept of a redirect chain didn't really work for a value of max redirect > 1. In the ideal world, we just want to have a source which points to a target (source -> target), discarding the concept of a redirect chain completely.
|
| Having something like: source -> target -> target1 -> target2 doesn't really work well with the current database design.
|
| NOTE: Support for $wgMaxRedirect will be removed soon, hence the deprecation without replacement interfaces.
|
| Bug: T290639
| Change-Id: I469de6f85e405e8ddbe7abaa5b99b77cb9cf415d
* Convert TrackingCategories to a service with DI (DannyS712, 2021-10-08, 1 file, -2/+1)
| Bug: T247194
| Change-Id: I50012e2a5e65aeee7671023d2fd5367e21e8ae67
* Replace uses of DB_MASTER with DB_PRIMARY (James D. Forrester, 2021-04-29, 1 file, -2/+2)
| Just an auto-replace from codesniffer for now.
|
| Change-Id: I5240dc9ac5929d291b0ef1c743ea2bfd3f428266
* build: Update mediawiki/mediawiki-codesniffer to 35.0.0 (Umherirrender, 2021-01-31, 1 file, -1/+1)
| Change-Id: Idb413be4b8cba8611afdc022af59810ce1a4531e
* refreshLinks.php: use hasOption() rather than getOption and assignment in conditional (Reedy, 2021-01-28, 1 file, -3/+6)
| Change-Id: I0f5bc2117b5d26e10f116b879d60e7c996690463
* Replace deprecated WikiPage::factory/newFromID in maintenance scripts (Umherirrender, 2020-11-12, 1 file, -3/+4)
| Change-Id: I5b2d4313f986484368da9b63c9a19892c2328dae
* Pass function name to database functions (maintenance scripts) (Umherirrender, 2020-06-07, 1 file, -1/+1)
| Useful for logging
|
| Change-Id: I79fe037abcd74f56c935abc118d706bef0198124
* Hooks::run() call site migration (Tim Starling, 2020-05-30, 1 file, -1/+1)
| Migrate all callers of Hooks::run() to use the new HookContainer/HookRunner system.
|
| General principles:
| * Use DI if it is already used. We're not changing the way state is managed in this patch.
| * HookContainer is always injected, not HookRunner. HookContainer is a service, it's a more generic interface, it is the only thing that provides isRegistered() which is needed in some cases, and a HookRunner can be efficiently constructed from it (confirmed by benchmark). Because HookContainer is needed for object construction, it is also needed by all factories.
| * "Ask your friendly local base class". Big hierarchies like SpecialPage and ApiBase have getHookContainer() and getHookRunner() methods in the base class, and classes that extend that base class are not expected to know or care where the base class gets its HookContainer from.
| * ProtectedHookAccessorTrait provides protected getHookContainer() and getHookRunner() methods, getting them from the global service container. The point of this is to ease migration to DI by ensuring that call sites ask their local friendly base class rather than getting a HookRunner from the service container directly.
| * Private $this->hookRunner. In some smaller classes where accessor methods did not seem warranted, there is a private HookRunner property which is accessed directly. Very rarely (two cases), there is a protected property, for consistency with code that conventionally assumes protected=private, but in cases where the class might actually be overridden, a protected accessor is preferred over a protected property.
| * The last resort: Hooks::runner(). Mostly for static, file-scope and global code. In a few cases it was used for objects with broken construction schemes, out of horror or laziness.
|
| Constructors with new required arguments:
| * AuthManager
| * BadFileLookup
| * BlockManager
| * ClassicInterwikiLookup
| * ContentHandlerFactory
| * ContentSecurityPolicy
| * DefaultOptionsManager
| * DerivedPageDataUpdater
| * FullSearchResultWidget
| * HtmlCacheUpdater
| * LanguageFactory
| * LanguageNameUtils
| * LinkRenderer
| * LinkRendererFactory
| * LocalisationCache
| * MagicWordFactory
| * MessageCache
| * NamespaceInfo
| * PageEditStash
| * PageHandlerFactory
| * PageUpdater
| * ParserFactory
| * PermissionManager
| * RevisionStore
| * RevisionStoreFactory
| * SearchEngineConfig
| * SearchEngineFactory
| * SearchFormWidget
| * SearchNearMatcher
| * SessionBackend
| * SpecialPageFactory
| * UserNameUtils
| * UserOptionsManager
| * WatchedItemQueryService
| * WatchedItemStore
|
| Constructors with new optional arguments:
| * DefaultPreferencesFactory
| * Language
| * LinkHolderArray
| * MovePage
| * Parser
| * ParserCache
| * PasswordReset
| * Router
|
| setHookContainer() now required after construction:
| * AuthenticationProvider
| * ResourceLoaderModule
| * SearchEngine
|
| Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
* Fix various MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment (Reedy, 2020-05-21, 1 file, -1/+2)
| Change-Id: I50c7c93f1534e966224f98a835ca01f93eb9416d
* Fix PSR12.Properties.ConstantVisibility.NotFound in maintenance/ (Reedy, 2020-05-09, 1 file, -1/+1)
| Change-Id: Ib0f081f7b278fdd3f4083fc5020bcac97f6015b4
* Replace wfWaitForSlaves() with LBFactory::waitForReplication() (Reedy, 2020-05-02, 1 file, -7/+10)
| Change-Id: I337147d61e2ec686a8672d0340dff4b6783f78cd