path: root/maintenance/findBadBlobs.php
Commit message (Author, Age, Files, Lines)
* findBadBlobs: Allow for timestamp based search via --scan-to (Amir Sarabadani, 2025-03-14, 1 file, -17/+46)
| Very similar to --scan-from. There is a bit of complexity involved in allowing --limit to be combined with this option, but locally it seems to work.
| Bug: T351953
| Change-Id: Ie218717a9f00318392e7a78f25ccfadba4b30503
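A minimal sketch of how a timestamp window could combine with a limit, written in Python rather than the script's own PHP. The function name `select_revisions`, the `(rev_id, timestamp)` tuples, and the inclusive bounds are assumptions for illustration; the commit does not state how the bounds or the limit are applied.

```python
def select_revisions(revisions, scan_from=None, scan_to=None, limit=None):
    """Pick revisions inside an optional timestamp window, oldest first.

    revisions: iterable of (rev_id, timestamp) tuples, where timestamp is a
    14-digit YYYYMMDDHHMMSS string, so string comparison matches time order.
    Both bounds are treated as inclusive here, which is an assumption.
    """
    ordered = sorted(revisions, key=lambda rev: rev[1])
    selected = [
        rev for rev in ordered
        if (scan_from is None or rev[1] >= scan_from)
        and (scan_to is None or rev[1] <= scan_to)
    ]
    # The limit caps how many revisions are scanned inside the window.
    return selected if limit is None else selected[:limit]
```

In this sketch the limit is applied after the window is cut, so a scan stops at whichever comes first: the upper bound or the limit.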
* maintenance: Also check for utf-8 encoding in findBadBlobs (Amir Sarabadani, 2025-03-05, 1 file, -3/+8)
| Currently, if you run this on broken revisions that trigger an exception, it passes on them. For example:
|   mwscript findBadBlobs.php --wiki eowiki --revisions 4062
| passes, while loading the revision on the web triggers an exception. These are clearly bad blobs and should be treated as such.
| Bug: T351953
| Change-Id: Id3164322269efd8aa67a1496d4019d50f433062e
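The idea of treating undecodable content as a bad blob can be sketched as follows. This is an illustration in Python, not the PHP code from the commit; `is_bad_blob` and the `load_blob` callable are hypothetical names, and the real script may detect invalid UTF-8 differently.

```python
def is_bad_blob(load_blob):
    """Return a short error description if the blob is bad, else None.

    load_blob is a hypothetical zero-argument callable that returns the
    raw blob as bytes, or raises if the blob cannot be loaded at all.
    """
    try:
        data = load_blob()
    except Exception as exc:
        # Any failure to load counts as a bad blob.
        return f"load failed: {exc}"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # The blob loaded, but its content is not valid UTF-8.
        return f"invalid UTF-8: {exc.reason}"
    return None
```

The second check is the point of the sketch: a blob that loads without error can still be unusable if its bytes are not valid UTF-8.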
* add `use MediaWiki\Maintenance\Maintenance` to some maintenance classes (Novem Linguae, 2024-12-04, 1 file, -0/+1)
| F–P. Still need to do P–Z.
| There are a couple of spots where I added `use MediaWiki\Maintenance\LoggedUpdateMaintenance;` or similar instead.
| Some of the existing "use" blocks were in weird spots (e.g. above the copyright docblock, or too far down). I didn't move those because they are out of scope for this patch.
| Change-Id: I5b6a8f3eae5be85d67bccfcce31c0c2027850f45
* maintenance: avoid calling Maintenance::setDBProvider() when not needed (Aaron Schulz, 2024-10-24, 1 file, -1/+0)
| Injecting the connection provider from the service container does not seem to serve much purpose since that is the default anyway.
| Bug: T377800
| Change-Id: Iacd16023be6dba0e4f90b5d720cae190fd9a0c7c
* Use explicit nullable type on parameter arguments (Umherirrender, 2024-10-16, 1 file, -1/+1)
| Implicitly marking parameter $... as nullable is deprecated in php8.4, the explicit nullable type must be used instead
| Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a
| Break one long line in SpecialPage.php
| Bug: T376276
| Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
* Exclude boilerplate maintenance code from code coverage reports (Dreamy Jazz, 2024-08-27, 1 file, -0/+4)
| Why:
| * Maintenance scripts in core have boilerplate code that is added before and after the class to allow directly running the maintenance script.
| * Running the maintenance script directly has been deprecated since 1.40, so this boilerplate code is only to support a now deprecated method of running maintenance scripts.
| * This code also cannot be marked as covered, due to PHPUnit not recognising code coverage for files.
| * Therefore, it is best to ignore this boilerplate code in code coverage reports as it cannot be marked as covered and also is for deprecated code.
| What:
| * Wrap the boilerplate code (requiring Maintenance.php and then later defining the maintenance script class and running if the maintenance script was called directly) with @codeCoverageIgnore comments.
| * Some files use different boilerplate code; however, these should also be marked as ignored for coverage for the same reason that coverage is not properly reported for files.
| Bug: T371167
| Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
* maintenance: Use expression builder instead of raw sql (Umherirrender, 2024-07-22, 1 file, -1/+1)
| Bug: T361023
| Change-Id: Ieb229d8088cb1ff3f03e44f7ac99eb612f48bc7b
* Maintenance: Print errors from StatusValue objects in a consistent way (Bartosz Dziewoński, 2024-06-12, 1 file, -7/+2)
| Allow Maintenance::error() and Maintenance::fatalError() to take StatusValue objects. They now print each error message from the status on a separate line, in English, ignoring on-wiki message overrides, as wikitext but after parser function expansion.
| Thoughts on the previously commonly used methods:
| - $status->getMessage( false, false, 'en' )->text()
|   Almost the same as the new output, but it allows on-wiki message overrides, and if there is more than one error, it prefixes each line with a '*' (like a wikitext list).
| - $status->getMessage( false, false, 'en' )->plain()
| - $status->getWikiText( false, false, 'en' )
|   As above, but these forms do not expand parser functions such as {{GENDER:}}.
| - print_r( $status->getErrorsArray(), true )
| - print_r( $status->getErrors(), true )
|   These forms output the message keys instead of the message text, which is not very human-readable.
| The error messages are now always printed using error() rather than output(), which means they go to STDERR rather than STDOUT and they're printed even with the --quiet flag.
| Change-Id: I5b8e7c7ed2a896a1029f58857a478d3f1b4b0589
* maintenance: Replace unnecessary uses of LBFactory and LoadBalancer (Bartosz Dziewoński, 2024-01-23, 1 file, -41/+11)
| * Change `$services->getDBLoadBalancerFactory()->waitForReplication()` to `$this->waitForReplication()`
| * Change various complicated expressions to `$this->getReplicaDB()` and `$this->getPrimaryDB()`
| * Remove unused variables
| Change-Id: Ia857be54938a32bb6288dcdf695a35cd38761c3c
* Migrate some usages of Database::update() to UpdateQueryBuilder (Alexander Vorwerk, 2024-01-17, 1 file, -6/+5)
| Bug: T353219
| Change-Id: I98bf4c2e2c3023fba226ac10826e52a1108b8aea
* Introduce ArchiveSelectQueryBuilder (Amir Sarabadani, 2023-09-07, 1 file, -18/+8)
| Similar to RevisionSQB (Ifd690dc8f030)
| Bug: T344971
| Change-Id: Ic520bcf09f4cc95ebd6a3990cff46dec5b7cd350
* Introduce RevisionSelectQueryBuilder (Amir Sarabadani, 2023-09-06, 1 file, -16/+9)
| Deprecating RevisionStore::getQueryInfo() and cleaning up a lot of code
| Also removing a brittle test that wasn't really testing anything.
| Bug: T344971
| Change-Id: Ifd690dc8f030f86e3567a717eaeb830cb6dc703b
* maintenance: Begin using `Maintenance::getServiceContainer()` (Derick Alangi, 2023-09-04, 1 file, -2/+1)
| Maintenance class provides a method for getting a fresh reference of the MW services container instance. Let's make use of this in maintenance scripts now that we have it.
| NOTE: There are still some static methods, like in refreshLinks.php, that make use of services and for which we can't use this method for now.
| Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
* Reorg: Move Status to MediaWiki\Status\ (Amir Sarabadani, 2023-08-25, 1 file, -0/+1)
| This class is used heavily basically everywhere, moving it to Utils wouldn't make much sense. Also with this change, we can move StatusValue to MediaWiki\Status as well.
| Bug: T321882
| Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
| Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
* Remove unused arguments to private functions (Umherirrender, 2023-02-08, 1 file, -3/+2)
| Found by phan dead detection
| Change-Id: I93379b7b9a733206d0e53add04fcdb9478c58755
* Remove unused local variable assignment (Umherirrender, 2023-02-04, 1 file, -2/+0)
| Dead code found by phan
| Change-Id: I9fc404d546a4fb1c61394cb6359eb774fd94383a
* Use buildComparison() instead of raw SQL in more maintenance scripts (Bartosz Dziewoński, 2022-11-15, 1 file, -3/+4)
| Bug: T321422
| Change-Id: Ibe46e5df64a3a6a6e8042a56e10aa286dd3797dd
* maintenance: Use SelectQueryBuilder to construct queries (Derick Alangi, 2022-07-22, 1 file, -48/+41)
| Part 2 of migrating files in `maintenance/` from IDatabase::select() to SelectQueryBuilder.
| Change-Id: I73eda0e4429016588bcfc6b3b490cb3fc0f5b711
* Replace uses of DB_MASTER with DB_PRIMARY (James D. Forrester, 2021-04-29, 1 file, -1/+1)
| Just an auto-replace from codesniffer for now.
| Change-Id: I5240dc9ac5929d291b0ef1c743ea2bfd3f428266
* Add Maintenance::waitForReplication() (Gergő Tisza, 2021-03-18, 1 file, -4/+0)
| Common utility method for maintenance scripts that's a little more clever than LBFactory::waitForReplication().
| Previously it was included in Maintenance::commitTransaction but that logs an error when there is no uncommitted change, and one might want to commit more often and wait for replicas to catch up only after some amount of commits.
| Change-Id: I3394536eea01eb982a4a2033fd2062bc67f6bdc1
* maintenance: Fix errors in parameter handling and output of findBadBlobs (daniel, 2021-01-22, 1 file, -4/+5)
| - in the instructions for how to extract IDs of bad revisions using grep, the expression was looking for the wrong string.
| - output for bad revisions didn't include the timestamp, making it harder to determine the duration of the problem that caused the bad revisions.
| - documentation advertised YYYY-MM-DD_HH:MM:SS as an allowed data format, but it wasn't actually supported.
| Bug: T272540
| Change-Id: Iac0c184c5a7008aec3b0899df30c6fb6644b23d9
* Merge "Add FindMissingActors script." (jenkins-bot, 2020-09-23, 1 file, -13/+1)
|\
| * Add FindMissingActors script. (daniel, 2020-09-22, 1 file, -13/+1)
| | This allows bad actor IDs to be overwritten with some default. This solves the problem of rows in tables like ipblocks, logging, or revision not being found due to a failing join against the actor table.
| | Bug: T261325
| | Change-Id: Ibc554d0b6f52e7b30cdde5138ac165774831ec36
* | Have findBadBlobs.php require Maintenance.php rather than cleanupTable.inc (Bill Pirkle, 2020-09-22, 1 file, -1/+1)
|/
| The findBadBlobs.php maintenance script unnecessarily required cleanupTable.inc instead of the typical Maintenance.php. While this worked (because cleanupTable.inc requires Maintenance.php), it was slightly confusing and slightly less efficient. Change to just require Maintenance.php instead.
| Bug: T263604
| Change-Id: I42dfb5220b701ec90f39e9ad905c1e32c9c28904
* includes: Use expression assignment operator += or |= where possible (Umherirrender, 2020-07-31, 1 file, -1/+1)
| It is easier to read.
| Change-Id: Ia3965b80153d64f95b415c6c30f526efa252f554
* findBadBlobs: better separate scan and mark modes. (daniel, 2020-06-30, 1 file, -16/+43)
| This makes the following changes to the findBadBlobs utility:
| - rename --from-date to --scan-from, to match the intended use.
| - require the usage of --revisions with --mark, so revisions cannot be marked directly when found by a scan.
| - catch any exception when testing for bad blobs, casting a wider net.
| - change the output format, so the IDs of bad revisions can easily be extracted by command line tools for further processing.
| - warn when trying to mark blobs that can successfully be read.
| The idea is to allow detection of blobs that are "bad" in a large variety of ways, including due to misconfiguration, while at the same time making sure that blobs do not get marked as bad due to temporary outages.
| The intended usage of findBadBlobs is to first scan a potentially problematic set of revisions using --scan-from, review the errors found, and then determine which of the revisions should be marked as bad. Once the bad revisions have been identified, a list with their IDs can be extracted from the output, and supplied back to findBadBlobs via the --revisions option.
| Bug: T251778
| Change-Id: I47c11190b665c1dac88db32ee2bf683728cb3dc6
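The separation between scanning and marking described above can be sketched roughly like this, in Python rather than the script's PHP. The names `scan`, `mark`, `check`, and `mark_bad` are hypothetical, and skipping readable blobs in mark mode is an assumption: the commit only says that a warning is issued for them.

```python
def scan(revision_ids, check):
    """Scan mode: report bad revisions without changing anything.

    check is a hypothetical callable that raises if the blob of the
    given revision cannot be read.
    """
    bad = []
    for rev_id in revision_ids:
        try:
            check(rev_id)
        except Exception as exc:  # catch anything, casting a wide net
            bad.append((rev_id, str(exc)))
    return bad


def mark(revision_ids, check, mark_bad):
    """Mark mode: only acts on an explicit list of revision IDs.

    Returns (marked, warned). Revisions whose blob reads fine are put
    in warned and left alone (assumed behaviour, see the lead-in).
    """
    marked, warned = [], []
    for rev_id in revision_ids:
        try:
            check(rev_id)
        except Exception:
            mark_bad(rev_id)
            marked.append(rev_id)
        else:
            warned.append(rev_id)
    return marked, warned
```

Because `mark` never discovers revisions on its own, a temporary outage during a scan cannot lead directly to blobs being marked as bad; a human reviews the scan output first.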
* Fix various MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment (Reedy, 2020-05-21, 1 file, -1/+2)
| Change-Id: I50c7c93f1534e966224f98a835ca01f93eb9416d
* findBadBlobs: Force rev_timestamp index (daniel, 2020-05-04, 1 file, -1/+5)
| Force the database to use the rev_timestamp index. MySQL/MariaDB was coming up with very slow query plans.
| Bug: T205936
| Change-Id: Iab68253c62a51463ba4afd072cd7bff2d1fafdde
* RevisionStore: improve error handling in newRevisionsFromBatch (daniel, 2020-05-03, 1 file, -18/+30)
| When for some reason we can't determine the title for a revision in the batch, this should not trigger a fatal TypeError, but be handled gracefully, with helpful information included in the error message.
| Bug: T205936
| Change-Id: I0c7d2c1fee03d1c9208669a9b5ad66612494a47c
* Allow specific revision IDs to be passed to markBadBlobs.php (daniel, 2020-04-17, 1 file, -13/+145)
| This adds a --revisions parameter to markBadBlobs.php that can be used to specify individual revisions, instead of scanning by date.
| Bug: T205936
| Change-Id: Ie1a907f2c15f1d4a85affff2701ff2289bfa77ea
* Add findBadBlobs script. (daniel, 2020-04-17, 1 file, -0/+371)
  This script scans for content blobs that can't be loaded due to database corruption, and can change their entry in the content table to an address starting with "bad:". Such addresses cause the content to be read as empty, with no log entry. This is useful to avoid errors and log spam due to known bad revisions.
  The script is designed to scan a limited number of revisions from a given start date. The assumption is that database corruption is generally caused by an intermittent bug or system failure which will affect many revisions over a short period of time.
  Bug: T205936
  Change-Id: I6f513133e90701bee89d63efa618afc3f91c2d2b
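The effect of a "bad:" address can be illustrated with a small sketch, in Python rather than the script's PHP. `BAD_PREFIX`, `mark_address_bad`, and `read_content` are hypothetical names, and simply prepending the prefix to the old address is an assumption; the commit message only says that the new address starts with "bad:".

```python
BAD_PREFIX = "bad:"


def mark_address_bad(address):
    """Rewrite a content address so that it starts with "bad:".

    Prepending the prefix to the old address is an assumption made for
    this sketch; an already marked address is returned unchanged.
    """
    if address.startswith(BAD_PREFIX):
        return address
    return BAD_PREFIX + address


def read_content(address, load):
    """Return the content for an address, or "" for a known bad one.

    load is a hypothetical callable that fetches the blob and may raise
    (and log) on corruption. It is never called for "bad:" addresses,
    so known bad revisions produce neither errors nor log entries.
    """
    if address.startswith(BAD_PREFIX):
        return ""
    return load(address)
```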