path: root/includes/parser/MWTidy.php
Commit message | Author | Age | Files | Lines
* Namespace all remaining classes in includes/parser
  Author: James D. Forrester | Age: 2024-10-15 | Files: 1 | Lines: -0/+5
  Bug: T353458
  Change-Id: If02cc9b1ff78e26c1cf8c91ee4695845eb133829
* Replace deprecated MWException
  Author: Daimona Eaytoy | Age: 2023-06-09 | Files: 1 | Lines: -1/+0
  Bug: T328220
  Change-Id: I66be7a6dd752d6b9c254beb65f4eb5ace3c89776
* Deprecate MWTidy and TidyDriverBase::supportsValidate()
  Author: C. Scott Ananian | Age: 2021-03-16 | Files: 1 | Lines: -0/+1
  Also copied the tests that used to be in TidyTest into RemexDriverTest,
  so that we're not losing coverage when MWTidy is eventually removed.

  Bug: T198214
  Change-Id: I0b301f6c98d0943ce4b6dc224f1066cb7bf244d1
* Introduce Tidy service
  Author: C. Scott Ananian | Age: 2021-03-15 | Files: 1 | Lines: -7/+3
  Refactor the old MWTidy singleton as a DI service.

  Change-Id: I95605ea5fd22f53a7f90fe07a6a73fa6c959597a
* Remove all methods of MWTidy except for MWTidy::tidy()
  Author: C. Scott Ananian | Age: 2020-08-17 | Files: 1 | Lines: -40/+3
  These methods were either @internal or deprecated in 1.35.

  Bug: T198214
  Change-Id: Ica1d1fdfd2a23a2040eac90c71f6211a4513c916
* Deprecate a few more tidy-related methods
  Author: C. Scott Ananian | Age: 2020-05-01 | Files: 1 | Lines: -0/+4
  Hard-deprecate ParserOptions::getTidy(), since it always returns true
  and is rarely used.
  Code search: https://codesearch.wmflabs.org/search/?q=getTidy%5C%28&i=nope&files=&repos=

  Soft deprecate most methods of MWTidy; folks should use MWTidy::tidy()
  as the entry point here.
  Code search: https://codesearch.wmflabs.org/search/?q=MWTidy&i=nope&files=&repos=

  Bug: T198214
  Change-Id: I3584181070da7ed4888beaaf04e083114aca1eab
* Remove codepaths which ran parser in 'untidy' mode
  Author: C. Scott Ananian | Age: 2020-04-13 | Files: 1 | Lines: -15/+1
  Disabling tidy has been deprecated since 1.33. This cleans up the code
  paths which still used untidy output.

  Bug: T198214
  Change-Id: I821ef3b8f59b272d983583d407b2f0794fe1e791
* Remove most support for configuring Tidy, including Raggett
  Author: C. Scott Ananian | Age: 2018-11-15 | Files: 1 | Lines: -55/+9
  Remex is pure PHP so there is no reason to use an external tidy any
  more. Configuration variables and implementation classes were
  deprecated in 1.32 or earlier. We've kept only $wgTidyConfig which can
  be used for experimental features or debugging Remex.

  Bug: T198214
  Change-Id: I99d48f858d97b6e1d1e6cd76a42c960cc2c61f9f
* Hard deprecate $wgTidyConfig['driver'] = 'disabled'
  Author: C. Scott Ananian | Age: 2018-10-25 | Files: 1 | Lines: -0/+1
  This was already deprecated in the release notes, and is not used in
  production, but I'd overlooked adding an appropriate hard deprecation
  notice in MWTidy::factory() to notify downstream users.

  Change-Id: I8f4d8154a1d8a233017f54f0fb4bcfdf4a0373e1
* Hard-deprecate the $wgUseTidy option
  Author: C. Scott Ananian | Age: 2018-09-20 | Files: 1 | Lines: -0/+2
  This has been soft-deprecated since MW 1.26; this hard-deprecation sets
  the stage for future removal of this old cruft.

  Bug: T198214
  Depends-On: Idf246d05d116f63a73105b50a1929a7721fbe7b9
  Change-Id: I2e7d990da1da378eb6e828d4b3c0f5a41791dd92
* tidy: Remove obsolete Depurate and Balancer drivers
  Author: Kunal Mehta | Age: 2018-05-08 | Files: 1 | Lines: -6/+0
  The Html5Depurate driver was intended to be used with an external Java
  service, but it never gained traction due to deployment concerns.

  The Html5Internal (Balancer) driver was originally intended for use
  with the balanced templates proposal and could also handle tidying. But
  it was tightly coupled to MediaWiki, so part of it was used as the
  basis of the RemexHtml library. Remex most likely can also implement
  the balanced templates proposal, so there isn't any reason to keep the
  Balancer code around anymore.

  Change-Id: I8542d69e9cdbf0e2fb7ebbb919933a64c1b8c293
* Immediately drop wgValidateAllHtml and related code
  Author: James D. Forrester | Age: 2018-04-10 | Files: 1 | Lines: -21/+0
  Bug: T191670
  Change-Id: If13d02ee1b30fec1c701226af9d363c6e08b3737
* parser: Update MWTidy::checkErrors() error message
  Author: Timo Tijhof | Age: 2018-03-21 | Files: 1 | Lines: -1/+1
  When setting the following on PHP 7, the produced error message did not
  make sense (references something about HHVM).

  > $wgValidateAllHtml = true
  > $wgTidyConfig = ['driver' => 'RemexHtml'];

  Change-Id: I5f14505639a79aca66f570a9a00c38cdea0cc1ba
* Add benchmarkTidy.php, to benchmark tidy drivers
  Author: Tim Starling | Age: 2017-04-21 | Files: 1 | Lines: -1/+1
  Plus representative input file

  Change-Id: I254793fc55c57a98c07ae1e4c27e6005965c9a20
* Add RemexHtml to the list of available Tidy drivers
  Author: Tim Starling | Age: 2017-03-09 | Files: 1 | Lines: -0/+3
  Change-Id: I5a87a6ed24ca3ef7c5fdb21e74f9eb410bf74b4c
* Add/update doc blocks for MWTidy
  Author: Reedy | Age: 2016-07-29 | Files: 1 | Lines: -2/+11
  Change-Id: I0b87e119048fd993f8bfda25a6c6b744d59804d1
* Add MWTidy::factory()
  Author: Tim Starling | Age: 2016-07-26 | Files: 1 | Lines: -21/+32
  A convenient factory function to eliminate code duplication in
  ParserMigration's MigrationEditPage::tidyParserOutput().

  Change-Id: I058912885025e7a9402912236c65c44e32ef036e
* Hide marked empty elements by default (stage 1)
  Author: Tim Starling | Age: 2016-07-14 | Files: 1 | Lines: -18/+0
  We originally imagined rolling out the display of empty elements
  simultaneously with the Html5Depurate, but now we have added support
  for marking empty elements to Html5Depurate and plan on having some
  sort of longer migration period.

  So, move the relevant CSS to content.css, and remove the concept of CSS
  dependent on tidy driver. Add a body class which will allow the effect
  to be toggled in a gadget or extension. Actual toggling in the CSS will
  be in the stage 2 patch, to be deployed after the varnish cache and
  parser cache have expired.

  I originally imagined that there would be a gadget that overrides the
  rule with an !important selector, but that method does not allow you to
  recover the original display property, which is often overridden by the
  style attribute or site CSS to be "inline".

  Also, in RaggettWrapper, switch to the new class mw-empty-elt,
  following Html5Depurate, instead of mw-empty-li. The old class will be
  removed in the stage 2 patch.

  Change-Id: Ic0f432c43a006629ca5a1a7c2dda3552ceb4dc4f
* Rewrite TidySupport and add option --use-tidy-config
  Author: Tim Starling | Age: 2016-07-12 | Files: 1 | Lines: -0/+2
  * Have TidySupport provide $wgTidyConfig instead of the legacy globals
  * Add --use-tidy-config option to parserTests.php. This tells
    TidySupport to use the tidy configuration from LocalSettings.php
    instead of the traditional safe defaults.
  * Add a way for TidySupport to disable tidy via $wgTidyConfig, using
    driver=>disabled

  Change-Id: Ie76e68e2d5238d0a1aef49a1a815c0d1cd8bfdae
* Hook up Balancer as a Tidy implementation.
  Author: C. Scott Ananian | Age: 2016-07-12 | Files: 1 | Lines: -0/+3
  This is an HTML5-compliant parse/serialize tidy implementation, with
  well-delineated hacks to support the <p>-wrapping done by legacy tidy.

  Change-Id: I4fd433fd6f1847061b0bf4b3e249c918720d4fae
* Convert all array() syntax to []
  Author: Kunal Mehta | Age: 2016-02-17 | Files: 1 | Lines: -4/+4
  Per wikitech-l consensus:
  https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

  Notes:
  * Disabled CallTimePassByReference due to false positives (T127163)

  Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
* Client-side migration for empty li preservation
  Author: Tim Starling | Age: 2015-10-28 | Files: 1 | Lines: -0/+18
  It is desirable in terms of user-friendly syntax to display an empty
  list item if the user adds one to the source. However, we suspect that
  this change will break the rendering of existing templates. So,
  preserve the empty <li> element, but style it with display:none so that
  there is no user-visible change. Changes can then be observed with a
  user script, then eventually the CSS can be removed so that the desired
  behaviour will be user visible.

  This is imagined as a staged deployment of T89331, i.e. it is better to
  resolve differences with Html5Depurate one at a time instead of
  deploying it all at once.

  The CSS module is specified in parser/MWTidy.php since the tidy driver
  hierarchy is not meant to be so closely tied to the MW environment.

  Bug: T49673
  Change-Id: Ifb44b782c617240e3de73dcdf76c8737c7307d94
* Fixed spacing
  Author: umherirrender | Age: 2015-09-26 | Files: 1 | Lines: -2/+2
  - Removed space after cast
  - Removed spaces in array index
  - Removed double spaces
  - Added spaces around string concat
  - Fixed mixed tabs and spaces at begin of line

  Change-Id: I38e849723f055d2d4c05cba72f5c245a28e8d5da
* Add Html5Depurate tidy driver
  Author: Tim Starling | Age: 2015-09-11 | Files: 1 | Lines: -1/+5
  Also document input format for MWTidy::tidy().

  Change-Id: I77071d3db0524695c2baf9a4670ca2455438c83d
* Abstract and refactor Tidy support
  Author: Tim Starling | Age: 2015-09-10 | Files: 1 | Lines: -248/+64
  * Split tidy implementations into a class hierarchy
  * Bring all tidy configuration into a single associative array and
    deprecate the old configuration.
  * Remove $wgAlwaysUseTidy

  This is preparatory to replacement of Tidy (T89331).

  I used the name "Raggett" for things relating to Dave Raggett's Tidy,
  since if we use "tidy" to mean the new abstract system as well as
  Raggett's tidy, it gets confusing.

  Change-Id: I77af1a16cbbb47fc226d05fb9aad56c58e8910b5
* Use a fixed marker prefix string in the Parser and MWTidy
  Author: Ori Livneh | Age: 2015-05-31 | Files: 1 | Lines: -6/+1
  Generating one-time, unique strip markers hurts us in multiple ways:

  * The strip marker regexes don't benefit from JIT compilation, so they
    are slower to execute than they could be.
  * Although the regexes don't benefit from JIT compilation, they are
    still compiled, because HHVM bets on regexes getting reused. This
    extra work is fairly costly (1-2% of CPU usage on the app servers)
    and doesn't pay off.
  * The size of the PCRE JIT cache is finite, and the caching of one-off
    regexes displaces from the cache regexes which are in fact reused.

  Tim's preferred solution (per his review comment on
  https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
  So:

  * Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX,
    which complements the existing Parser::MARKER_SUFFIX.
  * Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
  * Deprecate Parser::getRandomString(), since it is no longer useful.
  * In Preprocessor_*::preprocessToObj() and
    Parser::fetchTemplateAndTitle, replace any occurrences of \x7f with
    '?', to prevent strip marker forgery. \x7f is not valid input anyway.
  * Deprecate the $prefix parameter for StripState::__construct, since a
    custom prefix may no longer be specified.

  Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
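The fixed-marker idea in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: the constant values, the marker layout, and the helper names neutraliseMarkers(), makeMarker() and markerRegex() are hypothetical and are not taken from the MediaWiki source. The point is that the pattern string is the same on every request, so a regex cache can reuse it, and that replacing \x7f in user input means the input cannot contain a forged marker.

```php
<?php
// Assumed, illustrative values; the real Parser::MARKER_PREFIX and
// Parser::MARKER_SUFFIX may differ.
const MARKER_PREFIX = "\x7f'\"`UNIQ-";
const MARKER_SUFFIX = "-QINU`\"'\x7f";

/** Hypothetical helper: replace DEL bytes so user input cannot carry a marker. */
function neutraliseMarkers( string $input ): string {
	return str_replace( "\x7f", '?', $input );
}

/** Hypothetical helper: build the marker for the n-th stripped item. */
function makeMarker( string $tag, int $n ): string {
	return MARKER_PREFIX . $tag . '-' . sprintf( '%08X', $n ) . MARKER_SUFFIX;
}

/** Returns the same pattern string on every call, unlike a random per-request prefix. */
function markerRegex(): string {
	return '/' . preg_quote( MARKER_PREFIX, '/' )
		. '([^\x7f]+)'
		. preg_quote( MARKER_SUFFIX, '/' ) . '/';
}
```

A genuine marker produced by makeMarker() matches markerRegex(), while the same text passed through neutraliseMarkers() first no longer does, because the leading \x7f has been replaced.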
* Remove obvious function-level profiling
  Author: Chad Horohoe | Age: 2015-01-07 | Files: 1 | Lines: -9/+2
  Xhprof generates this data now. Custom profiling of various
  sub-function units is kept.

  Calls to profiler represented about 3% of page execution time on
  Special:BlankPage (1.5% in/out); after this change it's down to about
  0.98% of page execution time.

  Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
* Add lots of @throws
  Author: Reedy | Age: 2014-12-24 | Files: 1 | Lines: -1/+2
  Change-Id: I09d0c13070f966fcf23d2638d8fc1328279a5995
* Revert "Simplify MWTidy"
  Author: Ori Livneh | Age: 2014-12-18 | Files: 1 | Lines: -51/+89
  This is broken, for reasons indicated in
  <https://gerrit.wikimedia.org/r/#/c/180384/>. It was broken before, but
  I made it more broken. So revert for now, and I'll give this another
  stab.

  Change-Id: I7e67a61f7d6370f90487be6470bebe1449432a4c
* Fixed internalClean class/method existence check for HHVM
  Author: Aaron Schulz | Age: 2014-12-10 | Files: 1 | Lines: -2/+2
  * Follows up 4f281083fda91879a77fb87d64d8a9533526bd0c

  Change-Id: I5fa406ed1c4f2eefd1c22e9ab90e72655f31d162
* hhvm: Check for tidy function instead of class
  Author: Bryan Davis | Age: 2014-12-10 | Files: 1 | Lines: -1/+3
  Bug: T78166
  Change-Id: Ie60e23ffbafd698a3458eed1efce92d54c8d0c2a
* Simplify MWTidy
  Author: Ori Livneh | Age: 2014-12-09 | Files: 1 | Lines: -89/+51
  * Make the internal MWTidy::*clean() functions always return an array
    of two elements: the output buffer and the error buffer.
  * Make MWTidy::externalTidy() always read both stdout and stderr. We
    can read stderr after stdout because tidy.c produces output in the
    same order.
  * Remove the $stderr parameter from the private MWTidy::*clean()
    methods, since error output is always returned.
  * Merge MWTidy::phpClean and MWTidy::hhvmClean, since the difference
    between them is now small enough that splitting them up is not
    warranted.
  * On HHVM, MWTidy::internalTidy() always returns an empty string for
    the error buffer.

  Change-Id: I178b42d6ebdd1a5b9bd5921eb093a6c5014ffa49
* Fixed spacing
  Author: umherirrender | Age: 2014-12-05 | Files: 1 | Lines: -1/+1
  - Added/removed spaces around parenthesis
  - Added newline in empty blocks
  - Added space after switch/foreach/function
  - Use tabs at begin of line
  - Add newline at end of file

  Change-Id: I244cdb2c333489e1020931bf4ac5266a87439f0d
* Use HHVM+EZC internal tidy
  Author: Tim Starling | Age: 2014-11-28 | Files: 1 | Lines: -14/+50
  EZC doesn't currently support direct access to object properties via
  the obj->std.properties hashtable, but tidy uses this extensively. But
  it turns out that for production use cases, tidy_repair_string() should
  be sufficient. $wgDebugTidy and $wgValidateAllHtml are not used, and no
  deployed extension calls MWTidy::checkErrors().

  The only difference I know of is that errors from tidy (status==2) lead
  to the tidy output being used, rather than discarded. But
  TY_(ReportFatal) has very few callers in tidylib -- probably none that
  are reachable from stripped parser output.

  So, throw an exception if MWTidy::checkErrors() is requested on an HHVM
  instance with the tidy extension. For MWTidy::tidy(), use
  tidy_repair_string(). Refactor some relevant code.

  Bug: T758
  Change-Id: I8d5b1c2c9f9ddce46d8ad099a671a2e297d256e0
* Protect MathML from Tidy
  Author: physikerwelt | Age: 2014-08-22 | Files: 1 | Lines: -1/+3
  MediaWiki installations that use the setting
  $wgUseTidy = true;
  are unable to output MathML since the well-defined MathML elements are
  filtered out by Tidy. This was reported as
  http://sourceforge.net/p/tidy/patches/84/ .
  This change hides MathML blocks from Tidy.

  Bug: 66516
  Change-Id: Ib48b91238c3eddd6a86b62f6ce57801d7058f0d8
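One way to hide blocks from a tidy pass, as the commit above describes, is to swap them for opaque placeholders beforehand and restore them afterwards. The sketch below is a hypothetical illustration, not the code from this commit: the function name tidyPreservingMath(), the placeholder format, and the callback-based interface are all assumptions.

```php
<?php
/**
 * Hypothetical sketch: replace each <math>...</math> block with a plain-text
 * placeholder, run the supplied tidy callback, then put the blocks back.
 */
function tidyPreservingMath( string $html, callable $tidy ): string {
	$saved = [];
	$hidden = preg_replace_callback(
		'!<math\b[^>]*>.*?</math>!is',
		static function ( array $m ) use ( &$saved ): string {
			// The "-END" suffix keeps keys such as ...-1-END and ...-10-END distinct.
			$key = 'MATH-PLACEHOLDER-' . count( $saved ) . '-END';
			$saved[$key] = $m[0];
			return $key;
		},
		$html
	) ?? $html; // fall back to the untouched input if the regex engine fails

	$tidied = $tidy( $hidden );

	// strtr() swaps every placeholder back for the original MathML block.
	return strtr( $tidied, $saved );
}
```

Any cleaner that would otherwise strip unknown elements can be passed as $tidy; the MathML never reaches it, so it comes back unchanged, provided the cleaner leaves the placeholder text alone.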
* Fix phpcs issues in parser
  Author: addshore | Age: 2014-08-12 | Files: 1 | Lines: -1/+1
  This fixes all issues except for:
  - class names
  - line length

  Change-Id: Ie91b010d5b3eec49d3b80b6e93b125a901ef43c6
* Rename MWNamespace, MWDebug and MWTidy files to match their class
  Author: Timo Tijhof | Age: 2014-07-15 | Files: 1 | Lines: -0/+289
  Change-Id: I3e6d13ce366861c865401dde272bc2834a1de670