path: root/includes/tidy
Commit message (Author, Age, Files, Lines)
* Fix RemexCompatMunger infinite recursion (Tim Starling, 2017-11-17, 2 files, -1/+65)
|   When TreeBuilder requests reparenting of all child nodes of a given element, we do this by removing the existing child nodes, and then inserting the proposed new parent under the old parent. However, when a p-wrap diversion is in place, the insertion of the new parent is diverted into the p-wrap, and the p-wrap then becomes a child of the new parent, causing a reference loop, and ultimately infinite recursion in Serializer.
|
|   Instead, divert the entire reparent request to the p-wrap, so that the new parent is a child of the p-wrap. This makes sense since the new parent is always a formatting element. The only caller of reparentChildren(), apart from proxies, is AAA step 17, which reparents children under the formatting element cloned from the AFE list.
|
|   Left in some debug code for next time.
|
|   Bug: T178632
|   Change-Id: Id77d21d99748e94c064ef24c43ee0033de627b8e
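The reference loop described in this commit can be illustrated with a toy tree. This is only a sketch in Python, not the RemexHtml code: `Node`, `move_children`, `reparent_buggy`, `reparent_fixed` and `build` are hypothetical names, and the p-wrap diversion is modelled simply as "the final insertion lands in the p-wrap instead of the old parent".

```python
class Node:
    """Minimal tree node with a parent pointer (hypothetical, not RemexHtml's class)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)


def move_children(source, target):
    """Detach every child of `source` and append it to `target`."""
    moved, source.children = source.children, []
    for child in moved:
        target.append(child)


def has_parent_loop(node):
    """Walk parent links upward; return True if any node is visited twice."""
    seen = set()
    while node is not None:
        if id(node) in seen:
            return True
        seen.add(id(node))
        node = node.parent
    return False


def reparent_buggy(old_parent, new_parent, p_wrap):
    # The children of old_parent (including p_wrap) move under new_parent...
    move_children(old_parent, new_parent)
    # ...but inserting new_parent is diverted into p_wrap, closing a loop.
    p_wrap.append(new_parent)


def reparent_fixed(old_parent, new_parent, p_wrap):
    # Divert the whole request: p_wrap's children move under new_parent,
    # and new_parent becomes a child of p_wrap, which stays under old_parent.
    move_children(p_wrap, new_parent)
    p_wrap.append(new_parent)


def build():
    """Return (body, p_wrap, b) for the tree body > p_wrap > #text, with <b> detached."""
    body, p_wrap, text, b = Node("body"), Node("mw:p-wrap"), Node("#text"), Node("b")
    body.append(p_wrap)
    p_wrap.append(text)
    return body, p_wrap, b
```

In the buggy variant the p-wrap and the new parent end up as each other's parent, so any recursive walk over the tree never terminates; in the fixed variant the new parent sits under the p-wrap and the chain of parents ends at the root.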
* Remove @codingStandardsIgnore from long lines (Umherirrender, 2017-10-22, 1 file, -2/+0)
|   Breaks some lines where the ignore is not needed. The sniff was changed upstream to be okay with long unbreakable lines in comments.
|
|   Change-Id: I2bbe2be7cedd4d3c0ce8dc3e62d0e268bc171876
* Improve some parameter docs (Umherirrender, 2017-09-10, 2 files, -0/+8)
|   Add missing @return and @param to function docs and fix some @param.
|
|   Change-Id: I810727961057cfdcc274428b239af5975c57468d
* Use short type bool/int in param documentation (Umherirrender, 2017-08-20, 2 files, -6/+6)
|   Enable the phpcs sniffs for this and used phpcbf
|
|   Change-Id: Iaa36687154ddd2bf663b9dd519f5c99409d37925
* update mediawiki-codesniffer to 0.11.0 and fix issues (WMDE-Fisch, 2017-08-11, 1 file, -6/+6)
|   - mostly auto fixes
|   - some too long lines fixed
|   - ignore amp space in one case passing by reference
|
|   Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
* build: Update mediawiki/mediawiki-codesniffer to 0.10.1 (Kunal Mehta, 2017-07-22, 1 file, -1/+1)
|   And auto-fix all errors.
|
|   The `<exclude-pattern>` stanzas are now included in the default ruleset and don't need to be repeated.
|
|   Change-Id: I928af549dc88ac2c6cb82058f64c7c7f3111598a
* Remove auto-generated "Constructor" documentation on constructors (Thiemo Mättig, 2017-07-21, 1 file, -2/+0)
|   Having such comments is worse than not having them. They add zero information. But you must read the text to understand there is nothing you don't already know from the class and the method name.
|
|   This is similar to I994d11e. Even more trivial, because this here is about comments that don't say anything but "constructor".
|
|   Change-Id: I474dcdb5997bea3aafd11c0760ee072dfaff124c
* TidyDriverBase::validate throws an exception (addshore, 2017-06-30, 1 file, -0/+1)
|   Change-Id: I05e31c757ed92323ff905d993ac4d030b8aba1da
* build: Prepare for mediawiki/mediawiki-codesniffer to 0.9.0 (Umherirrender, 2017-06-26, 1 file, -1/+1)
|   The phpcs version in use has a bug, so version 0.9.0 could not be enforced at the moment. Will be fixed in the next version, see T167168.
|
|   Changed:
|   - Remove duplicate newline at end of file
|   - Add space between function and ( for closures
|   - and -> &&, or -> ||
|
|   Change-Id: I4172fb08861729bccd55aecbd07e029e2638d311
* Hide <style> tags from Tidy (Brad Jorsch, 2017-06-13, 2 files, -1/+10)
|   Some versions of html-tidy (e.g. the one currently in use on WMF wikis) will try to move all <style> tags in the body into the head, effectively removing them for our purposes. We need to avoid that for TemplateStyles.
|
|   Bug: T167349
|   Change-Id: I133776d16f366cad73ed30af0e5a665fdf9f5ed9
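The log does not show how the patch hides the tags, so the following is only one plausible approach, sketched in Python rather than the driver's PHP: rename `<style>` to a placeholder element before the text reaches Tidy, then rename it back afterwards. The placeholder name and both function names are assumptions made for illustration.

```python
import re

# Hypothetical placeholder element name; not taken from the actual patch.
PLACEHOLDER = "tidy-hidden-style"


def hide_style_tags(html):
    """Rename opening and closing <style> tags so a tidy pass does not treat them as style."""
    return re.sub(r"<(/?)style\b", r"<\g<1>" + PLACEHOLDER, html, flags=re.IGNORECASE)


def restore_style_tags(html):
    """Undo hide_style_tags() on the tidied output."""
    return re.sub(r"<(/?)" + re.escape(PLACEHOLDER) + r"\b", r"<\g<1>style", html)
```

A caller would wrap the tidy step as `restore_style_tags(tidy(hide_style_tags(text)))`, where `tidy` stands for whatever cleanup function is in use. Note that this sketch normalises upper-case `<STYLE>` to lower case on the way back.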
* Remove unused and unnecessary imports (Thiemo Mättig, 2017-06-12, 1 file, -1/+0)
|   Change-Id: I26e623a4e4ba965c07670369a90c8a95185ea1e4
* RemexCompatMunger: fix a couple of memory leaks (Tim Starling, 2017-03-23, 1 file, -2/+8)
|   Change-Id: I47578b3f73320e84a157417c288de97b5d26e18f
* Merge "RemexHtml tidy driver with p-wrapping" (jenkins-bot, 2017-03-08, 4 files, -0/+674)
|\
| * RemexHtml tidy driver with p-wrapping (Tim Starling, 2017-03-08, 4 files, -0/+674)
| |   Pull in the RemexHtml library, which is an HTML 5 library I recently created.
| |
| |   RemexCompatMunger mutates the event stream, inserting <mw:p-wrap> elements where necessary, and occasionally taking even more invasive action such as reparenting and removing nodes maintained in Serializer's tree.
| |
| |   RemexCompatFormatter produces a MediaWiki-style serialization which is relatively compatible with existing parser tests. It also does final empty element handling, including translating <mw:p-wrap> to <p>.
| |
| |   Tests are imported from both Html5Depurate and Subbu's pwrap.js.
| |
| |   Depends-On: I864f31d9afdffdde49bfd39f07a0fb7f4df5c5d9
| |   Change-Id: I900155b7dd199b0ae2a3b9cdb6db5136fc4f35a8
* | Clean up remaining get_class() uses (Timo Tijhof, 2017-03-07, 1 file, -1/+1)
| |   * get_class() -> __CLASS__ (same as self::class)
| |   * get_called_class() -> static::class
| |   * get_class($this) -> static::class
| |
| |   Change-Id: I1888a1897ecf4548a2e5a67a942e5c080dd7e3d3
* | Miscellaneous indentation tweaks (Bartosz Dziewoński, 2017-02-27, 1 file, -7/+7)
|/
|   I was bored. What? Don't look at me that way.
|
|   I mostly targeted mixed tabs and spaces, but others were not spared. Note that some of the whitespace changes are inside HTML output, extended regexps or SQL snippets.
|
|   Change-Id: Ie206cc946459f6befcfc2d520e35ad3ea3c0f1e0
* Update Balancer to latest HTML5 spec (C. Scott Ananian, 2017-01-24, 1 file, -19/+44)
|   This corresponds to the 1.0.27 release of domino, and matches the latest HTML5 spec as of 2016-10-18.
|
|   Changes include:
|   * <menuitem> is no longer an empty element.
|   * <isindex> has been removed.
|   * Updated html5lib-tests (copied from domino 1.0.27).
|   * Round-trip-safe serialization of <pre>/<listing>/<textarea> is only used when "tidy compatibility" mode is enabled; the behavior in the HTML5 spec no longer cleanly round trips.
|
|   Change-Id: I656944b0d7bb6c3c0e4fe44fc6ebd1a4c36412ad
* Un-blacklist PhanUndeclaredVariable (Erik Bernhardson, 2017-01-18, 1 file, -0/+1)
|   Undeclared variables are a very common error type that we want to catch as often as possible. To avoid needing to refactor a variety of global-level code (mostly in old-style maintenance scripts), this ignores undeclared variables in global scope. This is still a good improvement over what was happening previously.
|
|   Change-Id: I50b41d571724244552074b9408abbdf6160aca59
* RaggettWrapper: Don't use ReplacementArray (Kevin Israel, 2016-12-27, 1 file, -14/+11)
|   Instead, build the array and call strtr() directly.
|
|   Also did some other minor cleanup, such as making replaceCallback() private now that we require at least PHP 5.5, and changing &$this to $this.
|
|   Change-Id: If885df06710c76fdb35d3c7de78df7436ccb7abf
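For readers unfamiliar with the array form of PHP's strtr(): it replaces every key in a single pass, tries the longest keys first, and never rescans text it has already replaced. The Python function below is a rough equivalent written for illustration only; it is not MediaWiki code, and the name `strtr` is reused here purely for familiarity.

```python
import re


def strtr(text, pairs):
    """Approximate PHP's strtr($text, $pairs): one pass, longest key first, no rescanning."""
    # Empty keys are skipped, and keys are ordered longest first so that the
    # regex alternation prefers the longest match at each position.
    keys = sorted((k for k in pairs if k), key=len, reverse=True)
    if not keys:
        return text
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: pairs[m.group(0)], text)
```

Because the replacement happens in one pass, a mapping such as `{"a": "b", "b": "a"}` swaps the two letters instead of collapsing them into one, which is what makes this form suitable for placeholder-style substitutions.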
* Update weblinks in comments from HTTP to HTTPS (Fomafix, 2016-11-07, 1 file, -1/+1)
|   Use HTTPS instead of HTTP where the HTTP link is a redirect to the HTTPS link.
|
|   Also update some defect links.
|
|   Change-Id: Ic3a5eac910d098ed5c2a21e9f47c9b6ee06b2643
* Balancer: remove unnecessary extra argument (C. Scott Ananian, 2016-10-12, 1 file, -4/+2)
|   The full HTML5 spec clones element attributes when they are added to the ActiveFormattingElements list, so that when an element on that list is later cloned and reinserted the attributes are the *original* attributes, not reflecting any changes which embedded JavaScript in an inline <script> block may have made to them since the element was pushed.
|
|   However, the PHP implementation doesn't run any JavaScript so there's no way the attributes could change during balancing and there is thus no reason to keep extra copies of the attributes around.
|
|   Change-Id: I89647aeb90c64701d77e862ea9e3d22b19bbdedc
* Balancer: Add a bunch of phpdoc and 2 fixmes (Kunal Mehta, 2016-10-11, 1 file, -4/+27)
|   Change-Id: I0596c73cc87ec609d75aa4d8b241c2377bc4f9b1
* Fix function name case (Max Semenik, 2016-09-26, 1 file, -1/+1)
|   Change-Id: Ibd4f682d2ed8500a50d85aae38f17281646f7c2d
* Balancer: pass configuration array to flatten instead of individual booleans (C. Scott Ananian, 2016-08-06, 1 file, -15/+20)
|   This refactoring makes it easier to add additional options later without having to pass them manually through the call chain.
|
|   Change-Id: I46814f17d1b338b971ab57f63c2ec75d4a6b45d5
* Balancer style tweaks (Tim Starling, 2016-08-04, 1 file, -235/+223)
|   * Use for loops where appropriate, instead of while
|   * De-indent a large block which was unnecessarily indented
|   * Use camel case for variable names, per the style guide
|
|   Change-Id: I0b2c37fdcab7f7238db0393085c43297e7a03ab2
* Balancer: remove redundant assignment (Tim Starling, 2016-08-04, 1 file, -1/+0)
|   Change-Id: I6c22d6227e43a2c5be454955eff6b053a94a1657
* Balancer: consistent single-line comment style (Tim Starling, 2016-08-04, 1 file, -133/+134)
|   Also break a line that was over 100 bytes
|
|   Change-Id: I875d572d4147f2438526a49ca6cb5b73907bdc9b
* Support <textarea> tags in Balancer. (C. Scott Ananian, 2016-07-21, 1 file, -10/+73)
|   Change-Id: I63c2fd1c343362e49cf3b5a258fc98489744ad68
* Support tokenizing simple HTML comments in the Balancer. (C. Scott Ananian, 2016-07-21, 1 file, -5/+89)
|   Change-Id: Ib780595b13b7145e99867d16e3c225e6b2b91884
* Support <form> tags in Balancer. (C. Scott Ananian, 2016-07-21, 1 file, -7/+65)
|   Change-Id: I893fc231fea71f58449ed426d64ac99fdcb31d9e
* Support <select> tags in Balancer. (C. Scott Ananian, 2016-07-21, 1 file, -13/+120)
|   Change-Id: Ibc346624a9d035c98a29132a541e7ed6d82b364e
* Minor bug fixes to Balancer. (C. Scott Ananian, 2016-07-18, 1 file, -2/+4)
|   This follow-up to the refactor done in 5726c9ceb0644af360d37b86351b97ddfcbee20c prevents a crash when the first entry in the stack happens to be a BalanceMarker (and thus doesn't have a `$localName` property).
|
|   It also fixes an unrelated issue where unpaired close-heading tags (like `</h3>`) get entity-escaped instead of ignored.
|
|   Test cases exposing these bugs are added in Ie854cf99f7e72bcca1bb8565ace558a43dcb6379.
|
|   Change-Id: Ia9a1d435be1be10512071f5ff626b68742863483
* Hide marked empty elements by default (stage 1) (Tim Starling, 2016-07-14, 1 file, -1/+1)
|   We originally imagined rolling out the display of empty elements simultaneously with Html5Depurate, but now we have added support for marking empty elements to Html5Depurate and plan on having some sort of longer migration period.
|
|   So, move the relevant CSS to content.css, and remove the concept of CSS dependent on tidy driver.
|
|   Add a body class which will allow the effect to be toggled in a gadget or extension. Actual toggling in the CSS will be in the stage 2 patch, to be deployed after the Varnish cache and parser cache have expired.
|
|   I originally imagined that there would be a gadget that overrides the rule with an !important selector, but that method does not allow you to recover the original display property, which is often overridden by the style attribute or site CSS to be "inline".
|
|   Also, in RaggettWrapper, switch to the new class mw-empty-elt, following Html5Depurate, instead of mw-empty-li. The old class will be removed in the stage 2 patch.
|
|   Change-Id: Ic0f432c43a006629ca5a1a7c2dda3552ceb4dc4f
* Balancer: Inline BalancerStack::length() (Tim Starling, 2016-07-12, 1 file, -5/+6)
|   Provides 1% reduction in benchmark time
|
|   Change-Id: Ie8ff66a836cd137234828effcce9547e2cb3cd58
* Balancer: remove all Assert::parameterType() calls (Tim Starling, 2016-07-12, 1 file, -23/+7)
|   Profiling shows they are especially expensive. 29% reduction in benchmark time.
|
|   Change-Id: I5206b05007c7e1d6552974bcd7c57aa03eea231d
* Balancer: Introduce BalanceElement::isHtmlNamed() (Tim Starling, 2016-07-12, 1 file, -17/+27)
|   An optimised special case of BalanceElement::isA, reducing benchmark time by 4%
|
|   Change-Id: I1204de9454eb7b8f9f3a5ed137218c3293f9ab27
* Balancer: cache BalanceStack::currentNode() (Tim Starling, 2016-07-12, 1 file, -31/+43)
|   Make currentNode into a public property instead of a function call, for a performance improvement of about 4%.
|
|   Change-Id: I89861557531c55a63abef52c0acabbfb5c155bda
* Some Balancer improvements for performance and compatibility (Tim Starling, 2016-07-12, 1 file, -156/+377)
|   * Use a doubly-linked list for the AFE list, instead of an array, allowing efficient insertion and removal from the middle, and trivial O(1) lookup of existing elements.
|   * Use a hashtable of singly-linked lists for storing Noah's Ark buckets, instead of iterating through the entire AFE list on every push.
|   * Store attributes in an array instead of serializing them in the tokenizer. This allows us to avoid sorting them in the output. For the Noah's Ark clause, the array is copied and then sorted on demand.
|   * XHTML-style serialization with self-closing tags.
|   * Clear the AFE list in stopParsing(), otherwise all the BalanceElement objects are kept alive until after serialization, thus using O(N^2) memory (in stack depth N) since the full serialization is stored at each stack level.
|
|   Change-Id: I517129c0658f03eb2ddee61fdf33ffe6fbd48509
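The bucket idea in the second bullet can be sketched as follows. In the HTML5 tree builder, the Noah's Ark clause keeps at most three active formatting elements with the same tag name and attributes, dropping the earliest when another is pushed. This Python sketch is a simplification under stated assumptions: markers and namespaces are ignored, a plain list stands in for the doubly-linked list, and `Entry` and `FormattingList` are hypothetical names, not the Balancer's classes.

```python
from collections import defaultdict, deque


class Entry:
    """One active formatting element (hypothetical stand-in for a balancer element)."""

    def __init__(self, name, attrs, serial):
        self.name = name
        self.attrs = attrs
        self.serial = serial  # push order, kept only to make the result easy to inspect


class FormattingList:
    LIMIT = 3  # Noah's Ark clause: at most three identical entries

    def __init__(self):
        self.entries = []                  # a doubly-linked list in the commit
        self.buckets = defaultdict(deque)  # (name, sorted attrs) -> matching entries
        self._serial = 0

    def push(self, name, attrs=None):
        attrs = dict(attrs or {})
        # Sort attributes only to build the lookup key, not for storage or output.
        key = (name, tuple(sorted(attrs.items())))
        bucket = self.buckets[key]
        if len(bucket) >= self.LIMIT:
            # Drop the earliest identical entry without scanning for matches.
            # list.remove() is O(n) here; a linked list would make it O(1).
            self.entries.remove(bucket.popleft())
        entry = Entry(name, attrs, self._serial)
        self._serial += 1
        bucket.append(entry)
        self.entries.append(entry)
        return entry
```

Pushing the same `<b class="x">` four times leaves three entries on the list, and finding the one to evict costs a dictionary lookup instead of a walk over every active formatting element.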
* Hook up Balancer as a Tidy implementation. (C. Scott Ananian, 2016-07-12, 2 files, -5/+130)
|   This is an HTML5-compliant parse/serialize tidy implementation, with well-delineated hacks to support the <p>-wrapping done by legacy tidy.
|
|   Change-Id: I4fd433fd6f1847061b0bf4b3e249c918720d4fae
* HTML5 Balancer (C. Scott Ananian, 2016-07-12, 1 file, -0/+2886)
|   This adds an implementation of the HTML5 Tree Builder algorithm to PHP, along with test cases from the tree builder derived from the html5lib-tests package on github. The test cases were preprocessed into JSON for the `domino` HTML5 parser, and we're using the JSON form of the tests.
|
|   The implementation follows both the language of the HTML5 specification and the implementation in `domino` very closely, easing updates if the specification changes.
|
|   This code is used in follow-on commits to support an HTML5-based "tidy" for mediawiki and the `{{#balance}}` parser function, which ensures that a template expands to properly-balanced HTML, with all tags closed and nothing left on the HTML active formatting elements list.
|
|   See: https://github.com/fgnass/domino
|
|   Change-Id: I6f4d20a43510dd819776bb333b639315b19d150d
* Fix undefined classes (Erik Bernhardson, 2016-06-30, 2 files, -2/+2)
|   Applying static analysis to mediawiki core found a short list of classes that were undefined. Fix those up.
|
|   Change-Id: Ib7f9dbd847ada287b35afb799782fc04a3b39ce4
* Use english messages for background use of Status::getWikiText (umherirrender, 2016-04-12, 1 file, -2/+3)
|   Status::getWikiText is used for internal logging, API error messages and maintenance scripts. All these places are usually in English, so pass an English language to getWikiText.
|
|   Change-Id: I3010fca8eb5740a3a851c55a8b12e171714c78f7
* Tidy: <source> and <track> are empty elements (Derk-Jan Hartman, 2016-02-21, 1 file, -2/+2)
|   Seems these got accidentally added as inline items, even though they should be and are output as empty elements. This should correct that.
|
|   Bug: T122787
|   Change-Id: I6e75529c9d349050479c1b7ad758320d1e948e78
* Convert all array() syntax to [] (Kunal Mehta, 2016-02-17, 3 files, -20/+20)
|   Per wikitech-l consensus: https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html
|
|   Notes:
|   * Disabled CallTimePassByReference due to false positives (T127163)
|
|   Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
* Merge "Fix various mistakes in PHPDoc comments" (jenkins-bot, 2015-12-09, 1 file, -2/+2)
|\
| * Fix various mistakes in PHPDoc comments (Thiemo Mättig, 2015-12-09, 1 file, -2/+2)
| |   Change-Id: I434207f61e0663f2d2c9a076296c2e0d04a3fafb
* | Client-side migration for empty li preservation (Tim Starling, 2015-10-28, 1 file, -0/+8)
|/
|   It is desirable in terms of user-friendly syntax to display an empty list item if the user adds one to the source. However, we suspect that this change will break the rendering of existing templates.
|
|   So, preserve the empty <li> element, but style it with display:none so that there is no user-visible change. Changes can then be observed with a user script, then eventually the CSS can be removed so that the desired behaviour will be user visible.
|
|   This is imagined as a staged deployment of T89331, i.e. it is better to resolve differences with Html5Depurate one at a time instead of deploying it all at once.
|
|   The CSS module is specified in parser/MWTidy.php since the tidy driver hierarchy is not meant to be so closely tied to the MW environment.
|
|   Bug: T49673
|   Change-Id: Ifb44b782c617240e3de73dcdf76c8737c7307d94
* Fix issues identified by SpaceBeforeSingleLineComment sniff (Vivek Ghaisas, 2015-09-26, 1 file, -1/+1)
|   Change-Id: I048ccb1fa260e4b7152ca5f09b053defdd72d8f9
* Re-enable PSR2.Methods.MethodDeclaration.AbstractAfterVisibility (Reedy, 2015-09-26, 1 file, -1/+1)
|   Change-Id: I50a987edf03cb19bfd707cd00c143c3665eba94f
* Re-enable PSR2.Namespaces.NamespaceDeclaration.BlankLineAfter (Reedy, 2015-09-26, 1 file, -0/+1)
|   Change-Id: I39f71dde31f3ec18ab06904692f6a4ffd454d4d1