path: root/includes/parser
Commit message  Author  Age  Files  Lines
* Bump wikimedia/parsoid to 0.21.0-a4  Arlo Breault  2024-11-11  1  -1/+1
| In the new release of parsoid the PageBundle::$html field is given a non-nullable type hint, which causes phan errors unless we update the tests.
| Bug: T379319
| Depends-On: Ifb33cf12adda79c6b271ec0467975e2823f1c703
| Change-Id: I753bbbfaf99fb486384b0fa97de71159abb504b3
* ParserCache: Allow for gradual roll out of the new key scheme  Amir Sarabadani  2024-11-11  1  -0/+16
| This makes sure all entries for the same page end up in the same database table in the same cluster, so that a depool or crash of a parsercache host doesn't have an out-of-proportion effect on the cache overall.
| But if we just change the key scheme, every key will be displaced and everything will go down. So we need to introduce a temporary config variable to gradually roll out the change.
| Bug: T373037
| Change-Id: Iae9b8dd5dd65c6d7c8d3b6f786a110d72f0b959e
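A gradual roll-out of this kind can be pictured with the short Python sketch below. It is only an illustration, not the ParserCache code: the function names, the `rollout_percent` setting, and both key formats are hypothetical. The point it shows is that a stable hash of the page ID decides per page which scheme applies, so raising the percentage displaces a bounded share of keys at a time instead of all of them at once.

```python
import hashlib


def uses_new_scheme(page_id: int, rollout_percent: int) -> bool:
    """Place the page in one of 100 stable buckets and compare with the setting."""
    digest = hashlib.sha256(str(page_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


def cache_key(page_id: int, options_hash: str, rollout_percent: int) -> str:
    """Build a cache key; both formats are made up for this sketch."""
    if uses_new_scheme(page_id, rollout_percent):
        # Hypothetical new format: the page ID leads, so a store that shards
        # on the key prefix would keep all entries of one page together.
        return f"new:{page_id}:{options_hash}"
    # Hypothetical old format.
    return f"old:{options_hash}:{page_id}"
```

Because the bucket depends only on the page ID, a page that has moved to the new scheme at 10% stays on it at 50%, and setting the value to 0 restores the old keys for every page.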
* Drop empty ids  Arlo Breault  2024-11-01  1  -0/+3
| Empty ids aren't valid identifiers. The spec says they must contain at least one character:
| https://html.spec.whatwg.org/multipage/dom.html#global-attributes:the-id-attribute-2
| Test was introduced in If95fd9410f8d2e1ed403ea063e09670a7f71dcce
| Depends-On: Iec3c919ed1ea51acef9efabe979bd8d0feaf651a
| Change-Id: I3c547f5524530e976eb7aa960751265c8383f7b4
* Remove ParsoidOutputAccess  C. Scott Ananian  2024-10-29  1  -269/+0
| This class and all its methods were deprecated in MW 1.43.
| Change-Id: I514714159cb4a07e157e7bf012327e8bff184d7f
* Remove Message::objectParams() and related code  Bartosz Dziewoński  2024-10-27  1  -3/+1
| Deprecated in I492edabb7ea1d75774b45eb9fd18261b39963f9f.
| Bug: T278482
| Change-Id: Ie9350ed0d7b2604fb4d2f440dee66964fe198c0e
* Merge "Replace uses of deprecated MediaWiki\Message\Converter"  jenkins-bot  2024-10-23  1  -4/+2
|\
| * Replace uses of deprecated MediaWiki\Message\Converter  Bartosz Dziewoński  2024-10-23  1  -4/+2
| | The converter is no longer needed now that Message and MessageValue use the same internal format for the message parameters.
| | Bug: T358779
| | Depends-On: I625a48a6ecd3fad5c2ed76b23343a0fef91e1b83
| | Change-Id: I41392aca4ae6b40f3476397d7ca37ba6cadb2ae4
* | Merge "ParserOutput::getExternalLinks(): Deprecate use of the internal array reference"  jenkins-bot  2024-10-22  1  -0/+6
|\ \
| * | ParserOutput::getExternalLinks(): Deprecate use of the internal array reference  C. Scott Ananian  2024-10-21  1  -0/+6
| |/
| | In a future release this will return an array, not a reference to the internal array, to maintain abstraction and allow for representation changes internal to ParserOutput. This patch just adds deprecation notices to the class and to the release notes.
| | Change-Id: Ie3a3f98402c5a5a3a92326d7736c0df874829a6b
* | Merge "ParserOutput: Introduce ParserOutput::getLinkList()"  jenkins-bot  2024-10-21  2  -0/+207
|\ \
| * | ParserOutput: Introduce ParserOutput::getLinkList()  C. Scott Ananian  2024-10-18  2  -0/+207
| | | This deprecates a number of methods which returned arrays by reference and exposed internal representation details of the ParserOutput. It also regularizes the return values to return consistent LinkTarget values, working around the wide variety of different internal storage formats used for links.
| | | In the future, once these methods which expose the internal representation are removed, we can simplify our internal storage as well. But for the moment we add the new getter without changing the internal representation.
| | | Note that by returning TitleValue objects this new interface also provides a means to fix the issue identified in T204792 where interwiki and namespace prefixes were getting confused. A TitleValue properly distinguishes between these -- although the callers will still have to be careful to use it as a TitleValue and not attempt to reparse it.
| | | These methods also correctly handle fragments, which are present for the language link type but stripped for the other link types.
| | | Bug: T204792
| | | Change-Id: I48a2077b9645124f83082afd953d6bf7a861270b
* | | Merge "ParserOutput::runPipelineInternal: pass ParserOptions if provided"  jenkins-bot  2024-10-21  1  -1/+1
|\ \ \
| * | | ParserOutput::runPipelineInternal: pass ParserOptions if provided  C. Scott Ananian  2024-10-21  1  -1/+1
| | |/
| |/|
| | | Pipeline passes don't yet depend on ParserOptions, but they will.
| | | Change-Id: Ib15134a598a7e783c69c8e19bb29b53da6c4be55
* | | Merge "Deprecate ::setMetrics() calls with StatsdDataFactoryInterface"  jenkins-bot  2024-10-21  1  -2/+6
|\ \ \
| |/ /
|/| |
| * | Deprecate ::setMetrics() calls with StatsdDataFactoryInterface  C. Scott Ananian  2024-10-18  1  -2/+6
| | | HtmlInputTransformHelper::setMetrics() and HtmlToContentTransform::setMetrics() take a StatsFactory now; deprecate passing a StatsdDataFactoryInterface.
| | | Depends-On: I0d8eb6cacd761fa4959419b10d59046e61c714ff
| | | Change-Id: I2374731f6d37a191fc4a865d2665f2ca18182db1
* | | Merge "Parsoid: SiteConfig::prefixedStatsFactory() can never return null"  jenkins-bot  2024-10-21  1  -2/+2
|\| |
| * | Parsoid: SiteConfig::prefixedStatsFactory() can never return null  C. Scott Ananian  2024-10-18  1  -2/+2
| | | SiteConfig::$statsFactory is non-nullable, and StatsFactory::withComponent() never returns null.
| | | Change-Id: Ib14a1ee44b81476447717bc6aa00b54de1dca995
* | | Merge "parser: Increment expensive function count for special page transclusion"  jenkins-bot  2024-10-18  1  -21/+23
|\ \ \
| * | | parser: Increment expensive function count for special page transclusion  Umherirrender  2024-10-18  1  -21/+23
| | | | Transcluding a special page can result in extra database queries, so track it as an expensive operation to avoid too many uses on one page. When the expensive function count is exceeded, the parser falls back to displaying the transclusion as a normal wikilink.
| | | | Change-Id: I86c9cf1fdd0833012ddbf51184080e3135eb83ec
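The counting-and-fallback behaviour described in this commit can be sketched as follows. This is a simplified Python illustration with hypothetical names (`ExpensiveCounter`, `transclude_special_page`) and a placeholder for the rendered output; it is not the parser's actual code.

```python
class ExpensiveCounter:
    """Counts expensive operations on one page against a fixed limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def increment(self) -> bool:
        """Record one expensive operation; return True while within the limit."""
        self.count += 1
        return self.count <= self.limit


def transclude_special_page(title: str, counter: ExpensiveCounter) -> str:
    """Render the transclusion, or fall back to a plain wikilink over the limit."""
    if not counter.increment():
        # Limit exceeded: skip the costly rendering and emit a normal link.
        return f"[[{title}]]"
    # Placeholder for the real rendering, which may run database queries.
    return f"<rendered {title}>"
```

With a limit of 2, the first two transclusions on a page would be rendered and the third would appear as `[[Special:RecentChanges]]`.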
* | | | Merge "ParsoidParser: add `wiki` as a label to parse metrics"  jenkins-bot  2024-10-18  1  -4/+8
|\ \ \ \
| |_|_|/
|/| | |
| * | | ParsoidParser: add `wiki` as a label to parse metrics  C. Scott Ananian  2024-10-17  1  -4/+8
| | | | Refactored slightly to use the new MetricTrait::setLabels() method as well.
| | | | Change-Id: I4203c68d221630bc945a616544f80b05e40a1dad
* | | | Merge "Slightly simplify SiteConfig metrics implementation & improve doc"  jenkins-bot  2024-10-18  1  -11/+7
|\| | |
| * | | Slightly simplify SiteConfig metrics implementation & improve doc  C. Scott Ananian  2024-10-17  1  -11/+7
| | | | Document that ::observeTiming takes an argument *in milliseconds* (not seconds). Use the new Metric::setLabels() method to simplify the implementation a bit as well.
| | | | Change-Id: I374ff380466cfc5c12abb24793e8a4ed195db382
* | | | Add comment to ParserOutput::setIndexPolicy()  Lucas Werkmeister  2024-10-18  1  -0/+13
| |/ /
|/| |
| | | Change-Id: I01d03aa6204a13a92bb8bc00364c822c27aa60b9
* | | ParserOutput::addLanguageLink: Avoid a full Title parse  C. Scott Ananian  2024-10-17  1  -9/+11
| | | Bug: T296019
| | | Change-Id: I8a8d499a6a6646bc86a4be7e843430eecd08d0a4
* | | Use OutputPage::$metadata to store the 'prevent clickjacking' flag  C. Scott Ananian  2024-10-17  1  -1/+6
| | | Bug: T301020
| | | Depends-On: I885f778eef92fa7d2b7d6a2c2997db6a8b0142e5
| | | Change-Id: I3bfd47b078a5b84a88fffc04b48abe4c0023370f
* | | Merge "Use statslib for metrics emitted by HtmlInputTransformHelper, HtmlToContentTransform"  jenkins-bot  2024-10-17  1  -10/+30
|\ \ \
| |/ /
|/| /
| |/
| * Use statslib for metrics emitted by HtmlInputTransformHelper, HtmlToContentTransform  Yiannis Giannelos  2024-10-17  1  -10/+30
| | Bug: T359475
| | Change-Id: I7d4ca748c106dfd560dae31294decfb2b181e2db
* | Use explicit nullable type on parameter arguments  Umherirrender  2024-10-16  11  -15/+15
|/
| Implicitly marking parameter $... as nullable is deprecated in PHP 8.4; the explicit nullable type must be used instead.
| Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a
| Break one long line in SpecialPage.php
| Bug: T376276
| Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
* Namespace all remaining classes in includes/parser  James D. Forrester  2024-10-15  47  -56/+195
| Bug: T353458
| Change-Id: If02cc9b1ff78e26c1cf8c91ee4695845eb133829
* Merge "ParsoidParser: pass render reason to Parsoid; fix case of 'sampleStats'"  jenkins-bot  2024-10-12  1  -1/+2
|\
| * ParsoidParser: pass render reason to Parsoid; fix case of 'sampleStats'  C. Scott Ananian  2024-09-28  1  -1/+2
| | Every other option passed to parsoid (except `body_only`) is in camelCase, so make 'sampleStats' into a camel as well.
| | Pass the render reason to Parsoid so that parsoid-specific parse stats can be correlated with stats coming from the ParserOutputAccess.
| | Used in I88ba26fefd9d69ad3e2354d1e235b1e42d1914a0 but does not depend on that patch.
| | Change-Id: I2e5c897c55e41224567ed94bbf903c8fff96e841
* | Merge "Add static return type for `ParserOutput::getExternalLinks`"  jenkins-bot  2024-10-10  1  -2/+2
|\ \
| * | Add static return type for `ParserOutput::getExternalLinks`  Arthur Taylor  2024-10-08  1  -2/+2
| | | PHPUnit tests that mock the ParserOutput object are unable to correctly infer that the mock should return an empty array rather than null for `getExternalLinks`. This is currently causing test failures in SpamBlacklist in CI. Add the return type definition to the function field definition so that PHPUnit has a better chance at doing the right thing.
| | | Note that `getExternalLinks` returns `$this->mExternalLinks` by reference; if there’s some existing code which reassigns a non-array value to that reference (and, consequently, to `$this->mExternalLinks`), such code will start to throw TypeErrors during the assignment.
| | | Bug: T376633
| | | Change-Id: I246d5541200c9d0c405f30ea9de091ff9c0e759c
* | | Merge "Remove meaningless @var documentation from constants"  jenkins-bot  2024-10-09  1  -1/+0
|\ \ \
| * | | Remove meaningless @var documentation from constants  thiemowmde  2024-10-09  1  -1/+0
| | | | A constant is not a variable. The type is hard-coded via the value and can never change. While the extra @var probably doesn't hurt much, it's redundant and error-prone and can't provide any additional information.
| | | | Change-Id: Iee1f36a1905d9b9c6b26d0684b7848571f0c1733
* | | | ParsoidParser: ensure magic variable expansion uses pageLanguageOverride  C. Scott Ananian  2024-10-09  1  -0/+1
|/ / /
| | | This patch adds tests for the caching fix in Ie76020dc4fa3545f827e1674051530b479f01f31, but these tests also revealed that the recursive invocation of the legacy parser to expand magic variables like {{PAGELANGUAGE}} wasn't using the pageLanguageOverride, aka ParserOptions::getTargetLanguage().
| | | The page language override is used when parsing new context which doesn't currently exist in the database and therefore doesn't have a page language set by its title (which doesn't yet exist).
| | | Bug: T376783
| | | Follows-Up: Ie76020dc4fa3545f827e1674051530b479f01f31
| | | Change-Id: If6fe7cf00be6e78ef46181b17f01138383e95e46
* | | Merge "ParserOutput::setPageProperty(): emit deprecation warnings for non-strings"  jenkins-bot  2024-10-08  1  -2/+9
|\ \ \
| |/ /
|/| |
| * | ParserOutput::setPageProperty(): emit deprecation warnings for non-strings  C. Scott Ananian  2024-10-04  1  -2/+9
| | | This was deprecated in 1.42 but did not previously emit deprecation warnings.
| | | Depends-On: I072b111b047cfe13e32a822678d68165d1c76f84
| | | Depends-On: I2734383207b92f71bffc66ba2392a592a1df0954
| | | Depends-On: I79bb5030c13e83f664da1635254f4bc171ed4f3e
| | | Depends-On: If64a5239a40953f244657e60f95b2e938abfe447
| | | Change-Id: Ifefd3dab43247d988b7c7ff7874c05c90fc8ce1f
* | | Merge "ParserOutput: ensure all created ParserOutputs have a "start of parse" time set"  jenkins-bot  2024-10-07  1  -2/+20
|\ \ \
| * | | ParserOutput: ensure all created ParserOutputs have a "start of parse" time set  C. Scott Ananian  2024-10-04  1  -2/+20
| |/ /
| | | *Most* implementations of ContentHandler::fillParserOutput() ensure that the returned ParserOutput has had ParserOutput::resetParseStartTime() called on it at an appropriate time -- but not *all*. This is a belt-and-suspenders fix that ensures that every code path which creates a ParserOutput has *some* "start time" defined.
| | | This could be misleading if the parsing is done first and the parser output is created at the very end of the parse, but in all the code that I've looked at the ParserOutput is the first thing created and so this default should be reasonable.
| | | While we're at it, remove the parseStartTime from the serialized form of the ParserOutput, because it is useless after the object is unserialized.
| | | Bug: T376433
| | | Change-Id: I3bdf3996401a7d5ac4d8e1e5e6afb7ca410cbe6c
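The two ideas in this commit, a default start time set when the object is created and a serialized form that omits it, can be illustrated with the Python sketch below. The `Output` class and its JSON format are hypothetical stand-ins chosen for the example, not the ParserOutput implementation.

```python
import json
import time
from typing import Optional


class Output:
    """Toy stand-in for a parser output object."""

    def __init__(self, text: str = "") -> None:
        self.text = text
        # Belt-and-suspenders default: every new object has some start time,
        # even if the caller never resets it explicitly.
        self.parse_start_time: Optional[float] = time.monotonic()

    def reset_parse_start_time(self) -> None:
        """Callers that know the real start of the parse can reset the clock."""
        self.parse_start_time = time.monotonic()

    def to_json(self) -> str:
        # The start time is deliberately left out: it is meaningless once the
        # object has been stored and loaded again later.
        return json.dumps({"text": self.text})

    @classmethod
    def from_json(cls, data: str) -> "Output":
        obj = cls(json.loads(data)["text"])
        obj.parse_start_time = None  # no parse is in progress after loading
        return obj
```

The design choice mirrored here is that timing state belongs to the live parse, while the stored form carries only the result.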
* / / Provide a prefixed StatsFactory in parsoid config  Yiannis Giannelos  2024-10-04  1  -4/+11
|/ /
| | Change-Id: Ic3fc353b030a292952091813c9847cd697b25444
* | Switch over a bunch of class_alias uses to actuals  James D. Forrester  2024-10-03  2  -0/+3
| | Change-Id: Id175a83e71cc910eaee5d5890a9106872a3ca3b8
* | Merge "Add namespace to remaining parts of Wikimedia\Mime and Wikimedia\Stats"  jenkins-bot  2024-10-03  1  -1/+1
|\ \
| * | Add namespace to remaining parts of Wikimedia\Mime and Wikimedia\Stats  James D. Forrester  2024-09-27  1  -1/+1
| | | Bug: T353458
| | | Change-Id: If0137003ab625017d322d57870448a02569668c3
* | | Merge "Add namespace to remaining parts of Wikimedia\ObjectCache"  jenkins-bot  2024-10-03  7  -3/+7
|\| |
| * | Add namespace to remaining parts of Wikimedia\ObjectCache  James D. Forrester  2024-09-27  7  -3/+7
| | | Bug: T353458
| | | Change-Id: I3b736346550953e3b2977c14dc3eb10edc07cf97
* | | Merge "Deprecate ParserOutput::setLanguageLinks(null)"  jenkins-bot  2024-10-02  1  -3/+12
|\ \ \
| * | | Deprecate ParserOutput::setLanguageLinks(null)  C. Scott Ananian  2024-10-02  1  -3/+12
| | |/
| |/|
| | | Bug: T376323
| | | Follows-Up: I82a05a51d94782ebb9fa87ff889ca0f633b3e15c
| | | Change-Id: I0952659ab245326e9e8352170fb0a629ec109e72
* | | Merge "Allow localized gallery widths; avoid spurious "double px" tracking category"  jenkins-bot  2024-10-02  1  -2/+13
|\ \ \
| |/ /
|/| |