path: root/includes/OutputTransform
Commit message (Author, Age, Files, Lines)
* Namespace all remaining files in includes/skin (James D. Forrester, 2025-03-25, 2 files, -2/+2)
|   Bug: T353458
|   Change-Id: I3e829e35c93bcaae75e401b1801bddf93c0b416c
* Merge "Re-apply "Use Remex for DeduplicateStyles transform"" (jenkins-bot, 2025-02-18, 1 file, -15/+20)
|\
| * Re-apply "Use Remex for DeduplicateStyles transform" (Bartosz Dziewoński, 2025-01-10, 1 file, -15/+20)
| |   This reverts commit 7f63d5250e8db443d8fce3016abd9757521b590d, re-applying commit 82da9cf14be08e9458f58fa96be51966a2fe7cb1. It can be re-applied safely after T354361 was fixed.
| |   Most of the incidental changes from the original patch are no longer needed, as they were made unnecessary by other work, or were applied in I4cb2f29cf890af90f295624c586d9e1eb1939b95.
| |   Change-Id: I1ff9a7c94244bffffe5574c0b99379ed1121a86d
* | Fix ContentDOMTransformStageTest after PageBundle refactoring (Isabelle Hurbain-Palatin, 2025-02-17, 1 file, -1/+1)
| |   Since Ibffb2daaf817ce41102211bb9668b29a6e59c0c1, the document creation passes 'markNew' as a parameter, which allows non-Parsoid output to round-trip without data-parsoid or ids getting added to it. This patch adjusts the ContentDOMTransformStageTest to take this into account, and sets markNew to true explicitly so that we do not depend on a Parsoid vendor patch.
| |   Change-Id: I13bd1316ff22aba5672e500c3b07149d17843811
* | Provide the page title to localization on message parsing (Isabelle Hurbain-Palatin, 2025-02-07, 2 files, -6/+50)
| |   Message parsing expects to have a page title available. The canonical way to do this seems to be to use the request context to create the message to be parsed, but this does not seem strictly necessary: Message::setContext only sets up the title and the user language callback, which would end up handled in the same way if we were passing RequestContext::getMain() here, which we have tried to avoid relying on so far in this part of the code.
| |   Hence, we rely on the page DBKey set by Parsoid to re-create a Title and pass it to the Message as PageReference, so that it is available when parsing the message.
| |   Bug: T380045
| |   Change-Id: I587e64bed068a33fec27a91d303fe0a8cd585317
* | Also omit heading wrapper if id is reused from source (Arlo Breault, 2025-01-31, 1 file, -0/+8)
| |   Bug: T373400
| |   Follows-Up: I0a42ac115fb2b96cbe56747d257b599d576179c0
| |   Change-Id: Ief8094ce4b52e7693eca67e87d56f2aeba49f615
* | Only omit heading wrapper if it has attributes (Arlo Breault, 2025-01-23, 1 file, -7/+20)
| |   $section->fromTitle being null is not a strong enough signal that the section is from a literal heading tag. Parsoid will set fromTitle to null for compound templates and various other cases in WrapSectionsState.
| |   The conclusion of T353489 was to only omit the wrapper if the heading had attributes, so we should match that here to avoid unnecessary Parsoid read view diffs.
| |   Bug: T373400
| |   Change-Id: I0a42ac115fb2b96cbe56747d257b599d576179c0
* | Suppress error log for empty headings (Arlo Breault, 2025-01-21, 1 file, -0/+7)
|/
|   It's a common enough occurrence that heading content will normalize to an empty string, which is set as the heading id when storing section metadata. However, Parsoid will determine that to be an invalid id when storing node data and reassign it a Parsoid generated id. Thus, section mapping will never select the heading node. There's no need to generate an error log for this case.
|   Parsoid could eagerly assign a generated id when it runs into this situation but that wouldn't match the legacy parser. T368722 suggests either considering empty headings as invalid and rendering the markup syntax or emitting a lint to have the uses cleaned up.
|   There's a test demonstrating empty headings section wrapping in I6aea9351786264bd79dddd7f5b234c923ef172a6
|   Bug: T375002
|   Bug: T368722
|   Change-Id: Ibf678b343dc6e7db6eee3ffd687ca2cbd72b7250
* Merge "OutputTransform: Fix double IDs on headings" (jenkins-bot, 2024-12-16, 1 file, -3/+7)
|\
| * OutputTransform: Fix double IDs on headings (Arlo Breault, 2024-12-13, 1 file, -3/+7)
| |   Based on Ifeaaba1d0215e6f67f889a09c02879cc9079aa19
| |   Bug: T366083
| |   Co-Authored-by: Bartosz Dziewoński <dziewonski@fastmail.fm>
| |   Change-Id: I2712e0fa9272106e8cd686980f847ee7f6385b6f
* | Merge "Remove unusual message keys for parser limit report" (jenkins-bot, 2024-12-13, 1 file, -1/+1)
|\ \
| * | Remove unusual message keys for parser limit report (Bartosz Dziewoński, 2024-11-14, 1 file, -1/+1)
| | |   As far as I can tell, no one ever used the "$key-value-text" or "$key-value-html" keys, only "$key-value":
| | |   https://codesearch.wmcloud.org/search/?q=-value-(html|text)%22%3A&files=en.json
| | |   https://global-search.toolforge.org/?q=.*&regex=1&namespaces=8&title=.*-value-%28html|text%29
| | |   They were added in 2013 (2b20038ce77ce8b939113a3edb7d25127c238d4e).
| | |   Change-Id: I175e834e2b425f0ba1b8650ae44dbf65fb23fe6e
* | | Merge "editpage: More consistently exclude unintended limit report entries" (jenkins-bot, 2024-12-11, 1 file, -1/+1)
|\| |
| |/
|/|
| * editpage: More consistently exclude unintended limit report entries (Bartosz Dziewoński, 2024-11-14, 1 file, -1/+1)
| |   Some entries in the limit report are not supposed to appear in this table, because they require parameters other than the one or two numeric parameters supported by this code. RenderDebugInfo already has a list of them and special handling.
| |   However, excluding them relied on this check within the foreach loop:
| |     if ( !$keyMsg->isDisabled() && !$valueMsg->isDisabled() ) ...
| |   which will pass in qqx mode, or if the message key is defined on-wiki.
| |   Instead, exclude them based on a list, like in RenderDebugInfo.
| |   Bug: T379971
| |   Change-Id: Ie0ccb2d7ddab0ac4ba30f27c908059f23fb387a1
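The difference between the two exclusion strategies described in this commit can be sketched as follows. This is a Python illustration rather than the PHP in the repository, and every name in it (the keys, `EXCLUDED_KEYS`, both functions) is hypothetical; the real list lives in RenderDebugInfo and is not shown in this log.

```python
# Hypothetical exclusion list, standing in for the one kept in RenderDebugInfo.
EXCLUDED_KEYS = {"limitreport-timingprofile"}


def rows_by_message_check(report, is_disabled):
    """Old approach: keep every entry whose message is not disabled."""
    return [key for key in report if not is_disabled(key)]


def rows_by_exclusion_list(report, excluded=EXCLUDED_KEYS):
    """New approach: drop entries named in an explicit list."""
    return [key for key in report if key not in excluded]


# Hypothetical limit report: one numeric entry, one entry with non-numeric data.
report = {
    "limitreport-cputime": 0.5,
    "limitreport-timingprofile": ["100.00% 12.3 1 -total"],
}


def qqx_is_disabled(key):
    # Assumption for this sketch: in qqx mode every key resolves to a
    # placeholder, so no message is ever reported as disabled.
    return False


print(rows_by_message_check(report, qqx_is_disabled))
print(rows_by_exclusion_list(report))
```

Under these assumptions the message-based check lets the non-numeric entry through, while the list-based check filters it regardless of how messages resolve.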
* | Merge "Do not pre-parse MessageValue arguments" (jenkins-bot, 2024-11-15, 1 file, -27/+6)
|\ \
| * | Do not pre-parse MessageValue arguments (Isabelle Hurbain-Palatin, 2024-11-15, 1 file, -27/+6)
| |/
| |   The previous version of this code was parsing MessageValue arguments and "assembling" the i18n message, which led to a double-parse of the message, which led to "dangerous" tags getting escaped. This patch fixes this behaviour.
| |   Bug: T380045
| |   Change-Id: Ib08d2d44e4c84dadc50e1798afcbdc681cb459a3
* / Use new Parsoid DomPageBundle class (C. Scott Ananian, 2024-11-14, 1 file, -50/+12)
|/
|   This avoids dependencies on the internals of Parsoid's load/storage mechanism for data attributes.
|   Documents should also not be "prepared and loaded" when running parser tests, after d07ee3a694f3b3372aea690d9104d241082810fb in Parsoid.
|   Depends-On: Ic3c09444cef51767629a9f7fac9e79351bb1fc48
|   Depends-On: I753bbbfaf99fb486384b0fa97de71159abb504b3
|   Depends-On: I07b8d6f6c3006d238093b756df418b645ebd532a
|   Change-Id: I9e6b924d62ccc3312f5c70989477da1e2f21c86b
* Handle MessageValue as parameters of I18nInfo (Isabelle Hurbain-Palatin, 2024-11-08, 1 file, -5/+30)
|   Bug: T372709
|   Change-Id: Ieed7b5a18f5223c7b8a2918df88790d4dc305f9d
* Use explicit nullable type on parameter arguments (Umherirrender, 2024-10-16, 1 file, -2/+2)
|   Implicitly marking parameter $... as nullable is deprecated in PHP 8.4; the explicit nullable type must be used instead.
|   Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a
|   Break one long line in SpecialPage.php
|   Bug: T376276
|   Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
* Namespace all remaining classes in includes/parser (James D. Forrester, 2024-10-15, 16 files, -16/+16)
|   Bug: T353458
|   Change-Id: If02cc9b1ff78e26c1cf8c91ee4695845eb133829
* Avoid use of deprecated wfExpandUrl in ExtractBody (Ebrahim Byagowi, 2024-09-09, 2 files, -6/+21)
|   Change-Id: Ic68ecf6654c8e73a643adce2ef5dccb53b7a632a
* Introduce runOutputPipeline and clone by default (Isabelle Hurbain-Palatin, 2024-09-06, 1 file, -7/+7)
|   This is the third patch of a series of patches to remove ParserOutput::getText() calls from core. This series of patches should be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.
|   Here we temporarily introduce runOutputPipeline in ParserOutput. It creates and runs the pipeline with default options, and is called by getText. (This is not entirely truthful because we go through a runPipelineInternal transient method for null-argument-passing reasons, but let's not over-complicate this commit message.)
|   getText is responsible for maintaining the current behaviour, that is "disallow the cloning of the ParserOutput and putting text back to as it was" to mitigate T353257. As we get rid of getText, this behaviour should be moved, if necessary, to the caller site.
|   The new method is currently added to ParserOutput so that further refactorings are, for the moment, simpler. It will eventually be moved to another place within the Content framework.
|   We also rename 'suppressClone' to 'allowClone' (which is actually its negation) to avoid multiple levels of negations that make the code confusing. Note that the default value of 'allowClone' is true, and is currently overridden in two places: getText and OutputPage::getParserOutputText (which calls the pipeline directly and not through ParserOutput).
|   Bug: T293512
|   Bug: T371022
|   Change-Id: Ibf04af1079aaa1934dc78685b00e636ff4d38a9a
* Ensure that `isParsoidContent` is initialized in OutputTransformPipeline (C. Scott Ananian, 2024-08-26, 1 file, -0/+8)
|   The refactorings in I45951a49e57a8031887ee6e4546335141d231c18 replaced calls to ParserOutput::getText() with direct invocations of the pipeline, including in OutputPage::getParserOutputText(). However, the direct invocation skipped the implicit initialization of the options array previously done in ParserOutput::getText().
|   Ensure that the options array gets appropriate default values; in particular 'isParsoidContent' is expected to always be set.
|   Bug: T293512
|   Bug: T373405
|   Change-Id: Ib8d540b4221f7c00f6047706c4e3bfd88a2cb8cc
* Merge "Make ContentDOMTransformStage not Parsoid specific" (jenkins-bot, 2024-08-26, 1 file, -1/+29)
|\
| * Make ContentDOMTransformStage not Parsoid specific (Arlo Breault, 2024-08-14, 1 file, -1/+29)
| |   Previously, it assumed Parsoid content and loaded/stored data attributes unconditionally. As a result, if this stage was subclassed to be used in a non-Parsoid pipeline, the DOM would undesirably be dirtied with Parsoid ids or data-parsoid attributes.
| |   Change-Id: I2f1af43d9c39140ce215e2145e51cc3b02f68923
* | Move Language and friends into Language namespace (James D. Forrester, 2024-08-10, 2 files, -2/+2)
|/
|   Bug: T353458
|   Change-Id: Id3202c0c4f4a2043bf97b7caee081acab684155c
* HandleParsoidSectionLinks: also run this pass if COLLAPSABLE_SECTIONS (C. Scott Ananian, 2024-07-30, 1 file, -15/+14)
|   Bug: T371336
|   Change-Id: Ieccddc229c39f65de6f2bba6364f933592686ade
* Add OutputPipelineStages from extensions (Arlo Breault, 2024-07-25, 1 file, -2/+16)
|   Adds an experimental configuration to allow extensions to define OutputPipelineStages to include in the DefaultOutputPipeline.
|   There are a lot of open questions about this API, like ordering of execution, but adding it as @experimental will help surface the requirements.
|   Bug: T370541
|   Needed-By: I6dc92af0611c680b6e55605a7c9ff8a3fc1dfa26
|   Change-Id: I64baea40a1687c7a06fbcda9efe9f9a159b0ae8d
* Update defaults for AddWrapperDivClass (Isabelle Hurbain-Palatin, 2024-07-19, 1 file, -2/+2)
|   The rest of the pipeline is trying to have the same defaults in the pipeline built for (what is still) getText as the default options of the pipeline stages. This is currently not the case for AddWrapperDivClass; this patch fixes that.
|   Change-Id: I791d679a7b7309dfeb90c9736ef0e4848b038e08
* OutputTransform: Handle skipped tests in HydrateHeaderPlaceholders.php (Isabelle Hurbain-Palatin, 2024-07-11, 1 file, -0/+2)
|   A comment in I8744382dd24b28c623d0dc6569f800fb5489e6c1 mentions that two tests are skipped. This patch fixes one of these skips, and makes the other one more explicit.
|   Change-Id: Id5680fc163a9bfacfe797af619e40032cdee38b1
* Merge "Fix bundle reinjection of ContentDOMTransformStage" (jenkins-bot, 2024-06-24, 1 file, -12/+24)
|\
| * Fix bundle reinjection of ContentDOMTransformStage (Isabelle Hurbain-Palatin, 2024-06-11, 1 file, -12/+24)
| |   When re-injecting the page bundle to the newly created ParserOutput, we were omitting the version, headers and contentmodel data of said page bundle reinjection. This patch fixes that.
| |   Note that it will silence places where getText should typically not be called, but that's a larger problem that needs to be addressed on the calling places, and doesn't detract from the fact that we needed to fix this loss of information on the bundle anyway.
| |   Bug: T365433
| |   Depends-On: I2a87a8233b9e42cbafdba63bdf513abe00d826ce
| |   Change-Id: I7f57ddc76b9d3b24226f8b5da1b70bc83134856f
* | Use namespaced classes (3) (Umherirrender, 2024-06-16, 1 file, -2/+2)
|/
|   Changes to the use statements done automatically via script
|   Addition of missing use statement done manually
|   Change-Id: Ia35b2d3105880631dd26ec974068b000ac7f4b6b
* Use $stage::CONSTRUCTOR_OPTIONS in DefaultOutputPipelineFactory (C. Scott Ananian, 2024-06-10, 12 files, -29/+42)
|   Rather than have DefaultOutputPipelineFactory::CONSTRUCTOR_OPTIONS be a union of all the options needed by all the stages, allow each stage to define its own CONSTRUCTOR_OPTIONS and pass a Config object to the DefaultOutputPipelineFactory service.
|   In the process, move the $options and $logger properties into the abstract superclass, since they are passed to every stage.
|   Bug: T363764
|   Followup-To: I64aeb81b395ba84e1d839dfbd31decf16c337cd0
|   Change-Id: I7d386b22c7d8e99b6dfe4cf798069914ac9af373
* Refactor DI in OutputTransform stages (Arlo Breault, 2024-06-10, 9 files, -48/+113)
|   Bug: T363764
|   Change-Id: I64aeb81b395ba84e1d839dfbd31decf16c337cd0
* Inject MobileContext in DefaultOutputPipelineFactory (Arlo Breault, 2024-06-10, 2 files, -9/+16)
|   Change-Id: I613893fa236be956a4850a52a03a40e620c7ce64
* Get mobile url for Parsoid's baseHref (Arlo Breault, 2024-06-10, 1 file, -0/+10)
|   The legacy parser does not run ExpandToAbsoluteUrls unless it's doing ?action=render. ExpandToAbsoluteUrls doesn't work for mobile urls, which seems to be captured in T171398 / T195494. Since relative urls aren't resolved in legacy output though, the browser uses the mobile url.
|   Parsoid, however, does ExtractBody which has its own expandRelativeAttrs pass, which resolves relative urls against the baseHref in the document head. The baseHref is taken from MainConfigNames::Server, which presumably suffers the same issue as the above task. But also maybe MFE is transforming cached html, where the non-mobile baseHref is desirable.
|   In any case, to produce the same urls as the legacy parser, transform the baseHref to one that conforms with the mobile url template.
|   Bug: T365483
|   Change-Id: I32800f5ea848d70b6ef67ec9102c432b9626afcb
* Merge "Alias Parsoid DOM nodes to PHP DOM implementation" (jenkins-bot, 2024-05-22, 2 files, -11/+15)
|\
| * Alias Parsoid DOM nodes to PHP DOM implementation (C. Scott Ananian, 2024-05-22, 2 files, -11/+15)
| |   Parsoid abstracts the specific DOM implementation it is using, in practice (currently) using subclasses of the built-in \DOMDocument classes using the \DOMDocument::registerNodeClass() mechanism.
| |   Parsoid's own phan configuration uses stubs for its abstract DOM classes to encourage the use of "standard" DOM methods -- but core doesn't use Parsoid's phan configuration and doesn't really understand the way that ::registerNodeClass() works and so gets confused by code such as:
| |     $el = $document->createElement('div');
| |   In actual practice this is a Wikimedia\Parsoid\DOM\Document (a subclass of \DOMDocument) which creates a Wikimedia\Parsoid\DOM\Element (a subclass of \DOMElement) via the ::registerNodeClass() mechanism, but phan sees only the base \DOMDocument::createElement() signature and assumes this creates a \DOMElement *not* a Wikimedia\Parsoid\DOM\Element. If you do "element-y" things on this, phan has no complaints, but if you pass this back to a Parsoid method which expects the abstract Wikimedia\Parsoid\DOM\Element type then phan (spuriously) complains. This type error can be hard to understand.
| |   Work around this issue by simply aliasing Parsoid's abstract DOM types to the built-in \DOMDocument etc types. The alternative would be to use Parsoid's stubs, but it seems cleaner (for now) to avoid reaching into vendor/wikimedia/parsoid/.phan/stubs to get them.
| |   Change-Id: I90b33c5d65bde1582be9a452a144808b6d53d914
* | Fix serialization errors in PageBundle extensiondata (Isabelle Hurbain-Palatin, 2024-05-17, 1 file, -0/+17)
|/
|   When going through a ContentDOMTransformStage, we try to move the PageBundle when transforming the document from and to DOM. In the current version of this code, this adds DataParsoid, a non-serializable class, to ExtensionData, which breaks on ParserCache storage in later steps.
|   This patch is pretty hacky, but it transforms the PageBundle structure back to a stdClass so that it can be re-serialized before cache insertion. The added test fails without this patch.
|   Hopefully we'll get rid of these hacks when using a HTMLHolder later.
|   Bug: T365036
|   Change-Id: Icc74edd43ea5098faebc21a084b6d483d6ab99d1
* Merge "Add Parsoid HTML version to wrapper div" (jenkins-bot, 2024-05-14, 1 file, -3/+8)
|\
| * Add Parsoid HTML version to wrapper div (C. Scott Ananian, 2024-05-06, 1 file, -3/+8)
| |   Followup-To: I941d31479eebb12ea1f4dcdb0a1737033ddc8ac1
| |   Depends-On: I95be56e3662f9cffd1eb5c03bbc0379d4e0a9ee0
| |   Change-Id: I4aaa4b9e800271c2bcfc2fd74f09853b31ee6859
* | Fix the loss of ParserOutput pointer in ContentDOMTransformStages (Isabelle Hurbain-Palatin, 2024-05-10, 1 file, -3/+1)
| |   When running a ContentDOMTransformStage, we effectively clone the input ParserOutput, which is in contradiction with the current expectations of the pipeline. This patch slightly modifies the logic by making it possible to apply PageBundle data to an existing ParserOutput without the necessity to create a new one.
| |   Bug: T364597
| |   Change-Id: I633fc33485f22cf645acd41650a6983df3b0a534
* | Merge "Localization output transform" (jenkins-bot, 2024-05-06, 2 files, -0/+119)
|\ \
| * | Localization output transform (Isabelle Hurbain-Palatin, 2024-05-06, 2 files, -0/+119)
| |/
| |   This is an output transform to resolve the mw:I18n and mw:LocalizedAttrs to their localized forms.
| |   Bug: T358191
| |   Change-Id: Id32bc05ff72eb2d9fba7f8c2f192c9f7812cbc70
* / Move section edit links outside headings (new heading HTML) (Bartosz Dziewoński, 2024-05-06, 2 files, -18/+99)
|/
|   Legacy parser can now output headings using a more accessible markup, which is also identical to the markup used by the Parsoid parser. Changes to client-side JS and CSS necessary to support the new markup have already been merged in earlier commits.
|   includes/skins/Skin.php
|   includes/ServiceWiring.php
|   * Define a new skin option, 'supportsMwHeading', which can be used to toggle the new markup per-skin.
|   * Update the built-in fallback skin to enable it. This affects the output in parser tests.
|   docs/config-schema.yaml
|   includes/config-schema.php
|   includes/config-vars.php
|   includes/MainConfigNames.php
|   includes/MainConfigSchema.php
|   * Add a new configuration setting, 'ParserEnableLegacyHeadingDOM', which can be used to toggle the new markup per-site.
|   includes/OutputTransform/Stages/HandleSectionLinks.php
|   * Output new heading HTML for skins that enabled the option.
|   tests/*
|   * Duplicate parser tests that cover heading generation to cover both new and old markup. Update other parser tests to use new markup.
|   * Add some unit and integration tests for the behavior of the skin option and some parser tests for edge cases of the new markup.
|   Bug: T13555
|   Change-Id: I1180169a8e83af834c2984ba16089e6277f2a8dd
* Merge "Add ParserOptions::setCollapsibleSections()" (jenkins-bot, 2024-04-29, 1 file, -0/+13)
|\
| * Add ParserOptions::setCollapsibleSections() (C. Scott Ananian, 2024-04-29, 1 file, -0/+13)
| |   This is a non-default option that will add a <div> wrapper around section contents to allow client-side collapsing. This is intended for use by MobileFrontEnd, but could eventually be enabled for desktop read views as well.
| |   Since this parser option is in the "cache-varying options" set, any caller who sets this option will fork the cache for that page, which is reasonable as the parser option sets a ParserOutput property. In the future our caching strategy will get smarter and we'll add code which avoids the cache split and just transfers the appropriate values from ParserOptions to ParserOutput flags after the cached output is retrieved.
| |   Bug: T359001
| |   Change-Id: Ie93959a056ed15a728404eb293e4bb6eeaeb15c0
* | [OutputTransform] Add data-mw-parsoid-version to wrapper div (C. Scott Ananian, 2024-04-29, 1 file, -1/+5)
| |   Adding a data-mw-parsoid-version attribute to the wrapper div helps to unambiguously mark parsoid-generated output in a way which is compatible with CSS rules and client-side JavaScript.
| |   By embedding the current version of parsoid in the data attribute, sophisticated CSS rules can match against a specific version of Parsoid in order to facilitate proper behavior; for example:
| |     div[data-mw-parsoid-version^="0.20.0"]
| |   This could be useful in deployment scenarios where the parser cache might contain content generated by older or newer versions of Parsoid, for roll-forward or roll-back deployment scenarios, respectively.
| |   Bug: T363378
| |   Change-Id: I941d31479eebb12ea1f4dcdb0a1737033ddc8ac1
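The `^=` selector in this commit message is a CSS attribute prefix match. A minimal Python sketch of that matching rule follows; the function name, the sample attribute dictionaries and the version string "0.20.0-a12" are illustrative assumptions, not values taken from the repository.

```python
def matches_version_prefix(attrs, prefix):
    """Mimic the CSS selector [data-mw-parsoid-version^="<prefix>"]."""
    value = attrs.get("data-mw-parsoid-version")
    # No attribute means no match; CSS also never matches an empty prefix.
    return value is not None and prefix != "" and value.startswith(prefix)


# Hypothetical wrapper divs, represented as attribute dictionaries.
parsoid_wrapper = {"class": "mw-parser-output", "data-mw-parsoid-version": "0.20.0-a12"}
legacy_wrapper = {"class": "mw-parser-output"}

print(matches_version_prefix(parsoid_wrapper, "0.20.0"))
print(matches_version_prefix(legacy_wrapper, "0.20.0"))
```

Because the match is a plain string prefix, a short prefix such as "0.2" would also match "0.21.0", so a rule that targets one release probably wants the full version string.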
* | ExtractBody: Use page title recorded in ParserOutput (Subramanya Sastry, 2024-04-19, 2 files, -11/+15)
| |   * Followup to 9a466310
| |   * I had previously added page title info to ParserOutput as part of 6e5413b1, but while working on 9a466310, we didn't realize that.
| |   * Removed urldecode(..) since output of Title::getPrefixedDBKey isn't urlencoded and urldecode converts "+" into " ". A new test ensures that edge case works properly.
| |   * Simplify testing + add additional test to ensure title normalization doesn't trip up the transform.
| |   Bug: T358242
| |   Change-Id: I9a0cb00bdf9d104a4b327d72b1ec94cf509883a2
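The "+" edge case in this commit can be reproduced outside PHP. The sketch below uses Python's `urllib.parse.unquote_plus` as a stand-in for PHP's `urldecode`, since both turn "+" into a space; the DB keys are made-up examples rather than titles from the test suite.

```python
from urllib.parse import unquote_plus


def decode_like_urldecode(db_key):
    """Apply form-style URL decoding, which maps '+' to a space."""
    return unquote_plus(db_key)


# Hypothetical DB keys that were never URL-encoded in the first place.
for db_key in ["C++", "A+B", "Main_Page"]:
    decoded = decode_like_urldecode(db_key)
    status = "unchanged" if decoded == db_key else "corrupted"
    print(f"{db_key!r} -> {decoded!r} ({status})")
```

Decoding a string that was never encoded is harmless for most titles but silently rewrites any title containing "+", which fits the commit's choice to drop the call rather than guard it.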