path: root/tests/phpunit/languages/classes
Commit message (Author, Age, Files, Lines)
* Move Language subclasses to includes/ (Timo Tijhof, 2021-08-04, 55 files, -3466/+0)
    Depending on which namespace we want these classes to have after T166010,
    they could either stay in includes/languages/ (plural) in their own
    MediaWiki\Languages\-namespace dedicated to Language subclasses, or they
    could go into a subdirectory like `includes/language/languages/` if we
    want to keep them in the same top-level namespace as other Language
    classes and services, but in a more nested namespace.
    For now, I've made the smaller change and kept the Language subclasses in
    their own directory directly under includes/, not nested further.
    Bug: T225756
    Change-Id: I01015424707b442853879fd50c97f00215e5c2fa
* Fix a bunch of random typos (DannyS712, 2021-06-29, 1 file, -1/+1)
    * yeild -> yield
    * paramter -> parameter
    * seperator -> separator
    * neccesary -> necessary
    * inital -> initial
    * intial -> initial
    * repsonse -> response
    * retreived -> retrieved
    Bug: T201491
    Change-Id: I461941b027590997448f3bdd8a137a48bb338beb
* Add missing @param and @return to documentation in tests (Umherirrender, 2021-01-22, 1 file, -1/+6)
    Change-Id: Ic663e81cca0bf007804a70772250914a85f1fef4
* Improve some function documentation in tests (Umherirrender, 2021-01-14, 1 file, -2/+4)
    Also fix some whitespaces
    Change-Id: Ibed50a4f07442d3f299cf545c16f5dbb5f27a411
* Use Unicode minus in output of {{formatnum}} (C. Scott Ananian, 2020-11-16, 5 files, -14/+14)
    Bug: T10327
    Change-Id: I4b315d439fef7d7cdf2fc5ae1904e0460a2a60e0
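The change above swaps the ASCII hyphen-minus for U+2212 MINUS SIGN in formatted numbers. A minimal Python sketch of that idea follows; `use_unicode_minus` is a hypothetical helper for illustration, not the MediaWiki implementation.

```python
MINUS_SIGN = "\u2212"  # U+2212 MINUS SIGN


def use_unicode_minus(formatted: str) -> str:
    """Replace a leading ASCII hyphen-minus with U+2212.

    Hypothetical helper: only the sign position is touched, so hyphens
    elsewhere in the string are left alone.
    """
    if formatted.startswith("-"):
        return MINUS_SIGN + formatted[1:]
    return formatted
```

Restricting the replacement to the leading character is a design choice of this sketch; the real formatter may handle the sign differently.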
* Update formatNum implementation to match tr35 and latest CLDR (Santhosh Thottingal, 2020-10-21, 6 files, -44/+54)
    * Update digitGroupingPattern to match CLDR 31: New versions of CLDR
      have a digit grouping pattern with a decimal part. Update
      digitGroupingPattern values in Message classes with this improved
      pattern. Refer: http://unicode.org/reports/tr35/tr35-numbers.html
    * Refer to the following chart for the decimal patterns.
      http://www.unicode.org/cldr/charts/31/by_type/numbers.number_formatting_patterns.html
    * Uses PHP NumberFormatter class for the commafy implementation, which
      is available in PHP 7.
    * Some tests need to be updated to match the TR 35 spec
    * The formatNum public method in Language.php is the preferred way to
      use this feature. It does separator transformation and digit
      transformation wherever applicable.
    * Renamed the second param name for formatNum from noCommafy to
      noSeparators
    * commafy method is deprecated and formatNum is preferred. Practically,
      we are not just adding comma, but separators according to the
      language. Replaced some tests based on commafy methods with tests
      based on formatNum.
    Note: The corresponding js implementation is not changed in this commit.
    It would probably be a good idea to use globalize.js, which is also
    based on the CLDR patterns.
    Note: This patch preserves the existing off-by-one error in
    $minimumGroupingDigits; T262500 will eventually fix this.
    Bug: T167088
    Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
    Change-Id: Ic721b9a91e78e4ef07040339d1006b7a90a910c0
* Add accusative case to Russian language GRAMMAR (Amir Aharoni, 2020-10-10, 1 file, -0/+65)
    Bug: T257500
    Change-Id: I30a892a936c0ed9247bc6b63be747697cb9f3e26
* Adding default locative rule for Ukrainian (Base, 2020-06-12, 1 file, -0/+5)
    Prepending a default preposition to locative GRAMMAR forms in Ukrainian
    language
    Bug: T149550
    Change-Id: I4649549dc3c722e53c7ea3accb6747df420e56f7
* build: Bump mediawiki-codesniffer to 31.0.0 (Daimona Eaytoy, 2020-05-30, 1 file, -0/+1)
    Done with `composer fix` and suppressing the rest (i.e. sniffs for
    global variables, which for core should be suppressed anyway).
    Additionally, add `-p` to `phpcbf`, as otherwise it just seems stuck.
    Change-Id: Ide8d6cdd083655891b6d654e78440fbda81ab2bc
* Reduce usage of the Language class (ArtBaltai, 2020-03-03, 2 files, -2/+26)
    reduce/deprecate visibility of some members of the Language class
    Bug: T243913
    Change-Id: I6bad608455ceaa46f895f00dcc6380cec6d32680
* Remove $wgFixArabicUnicode and $wgFixMalayalamUnicode (DannyS712, 2020-02-21, 2 files, -10/+0)
    Bug: T241352
    Bug: T241353
    Change-Id: Idefd5624d761fad4a6c3cca950ce1038a4dec770
* Merge "languages: Add @group Language to all tests related to Language for easier navigation through tests" (jenkins-bot, 2020-02-03, 40 files, -27/+94)
|\
| * languages: Add @group Language to all tests related to Language for easier navigation through tests (Peter Ovchyn, 2020-02-03, 40 files, -27/+94)
| |     Bug: T226833, T243761
| |     Change-Id: Ied7d4a1db661f5cfaefe6c392348ff56b1a5616c
* | languages: Move Converter and tests to respective files (Peter Ovchyn, 2020-02-03, 2 files, -128/+0)
|/
    Bug: T226833, T243760
    Change-Id: I6fc7f267098d663fbefd0e78457726c343c9b3e4
* languages: Introduce LanguageConverterFactory (Peter Ovchyn, 2020-02-03, 7 files, -10/+2)
    Done:
    * Replace LanguageConverter::newConverter by
      LanguageConverterFactory::getLanguageConverter
    * Remove LanguageConverter::newConverter from all subclasses
    * Add LanguageConverterFactory integration tests which cover all
      languages by their code.
    * Caching of LanguageConverters in factory
    * Make all tests running (hope that would be enough)
    * Uncomment the deprecated functions.
    * Rename FakeConverter to TrivialLanguageConverter
    * Create ILanguageConverter to have shared ancestor
    * Make the LanguageConverter class abstract.
    * Create table with mapping between lang code and converter instead of
      using name convention
    * ILanguageConverter @internal
    * Clean up code
    Change-Id: I0e4d77de0f44e18c19956a1ffd69d30e63cf51bf
    Bug: T226833, T243332
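The factory described above combines three ideas: a shared converter ancestor, a table mapping language codes to converter classes (instead of a naming convention), and per-code caching. The Python sketch below illustrates that shape only. The class names echo the commit message, but the method names, the `UpperCaseConverter` class, and the "xx" code are assumptions made up for the example, not MediaWiki's API.

```python
from typing import Dict, Type


class ILanguageConverter:
    """Shared ancestor for all converters (Python stand-in for the interface)."""

    def __init__(self, code: str) -> None:
        self.code = code

    def convert(self, text: str) -> str:
        raise NotImplementedError


class TrivialLanguageConverter(ILanguageConverter):
    """Fallback converter: returns the text unchanged."""

    def convert(self, text: str) -> str:
        return text


class UpperCaseConverter(ILanguageConverter):
    """Made-up converter, present only to give the table a second entry."""

    def convert(self, text: str) -> str:
        return text.upper()


class LanguageConverterFactory:
    def __init__(self, table: Dict[str, Type[ILanguageConverter]]) -> None:
        self._table = dict(table)  # language code -> converter class
        self._cache: Dict[str, ILanguageConverter] = {}

    def get_language_converter(self, code: str) -> ILanguageConverter:
        # Build each converter at most once per language code.
        if code not in self._cache:
            cls = self._table.get(code, TrivialLanguageConverter)
            self._cache[code] = cls(code)
        return self._cache[code]
```

With `factory = LanguageConverterFactory({"xx": UpperCaseConverter})`, repeated calls to `factory.get_language_converter("xx")` return the same object, and any code missing from the table falls back to the trivial converter.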
* Coding style: Auto-fix MediaWiki.Usage.PHPUnit* (James D. Forrester, 2020-01-10, 1 file, -1/+1)
    Change-Id: I86fc55a4fc8ceafe368692173211bbcd6d8581d7
* LanguageNlTest: Split test and data provider (DannyS712, 2019-12-10, 1 file, -9/+15)
    Per inline @todo tag, split the test
    Change-Id: I38f01c2215b9ba09d6af117ae70b242de362b51f
* Remove Language::factory and getParentLanguage use (Aryeh Gregor, 2019-10-27, 1 file, -9/+12)
    Change-Id: I11f8801ef47ec1a1f63d840116e69667e6f3ae3c
* Clean up spacing of doc comments (Umherirrender, 2019-08-05, 2 files, -3/+3)
    Align the doc stars and normalize start and end tokens
    Change-Id: Ib0d92e128e7b882bb5b838bd00c74fc16ef14303
* Remove "Squiz.WhiteSpace.FunctionSpacing" from phpcs exclusions (Reedy, 2019-05-11, 2 files, -0/+2)
    Change-Id: I78b3315f26ab91b6b443f5b028a635552f82f5a3
* Fix comments in language class tests (Fomafix, 2018-12-25, 2 files, -2/+6)
    * Add `@covers LanguageGa`.
    * Language code `bs` is for "Bosnian (bosanski)" and not for
      "Croatian (hrvatski)".
    Change-Id: I605bdd254518dd708343e36a2dee65dd0aa17b63
* Make Language::hasVariant() more strict (C. Scott Ananian, 2018-10-22, 1 file, -0/+57)
    In d59f27aeab08b171e5ab6a081e763a4cad0bca04 we made
    LanguageConverter::validateVariant() try harder to convert a variant
    into an acceptable MediaWiki-internal form, looking at deprecated codes
    and BCP 47 aliases. However, this misled Language::hasVariant() into
    thinking that bogus names (like all-uppercase strings) were acceptable
    variant names, which then led to exceptions when they were passed to the
    various conversion methods.
    This is a belt-and-suspenders patch for T207433 -- in that case we
    shouldn't have created a Language object with code 'sr-cyrl' in the
    first place, but once one was created we shouldn't have tried to ask
    LanguageSr to convert texts to 'sr-cyrl'. The latter problem is fixed by
    this patch.
    Bug: T207433
    Change-Id: Id993bc7989144b5031a551662e8e492bd23f698a
* Deprecate $wgFixArabicUnicode / $wgFixMalayalamUnicode (C. Scott Ananian, 2018-10-21, 2 files, -0/+2)
    These were introduced in MW 1.17 and are always true in production.
    They were useful to allow folks to defer title conversion, but it's
    been a long time now. We don't need to make this optional any more.
    Change-Id: I65dcfe80dc3e1dfeb4d63924a8928655e012a20c
* Write Latin and other scripts with capital letter (Fomafix, 2018-10-05, 2 files, -6/+6)
    Change-Id: I16c660e54191b63cd6eb3407cb00504665930c4e
* languages: Add coverage for 'ar' and 'ml' normalize() (Timo Tijhof, 2018-08-14, 2 files, -13/+68)
    * Exclude the data files from PHPUnit coverage.
    * Add tests covering the normalize() implementations.
    * Fix a small todo about using data providers.
    * Set explicit visibility.
    Change-Id: Ib104cc3215a36901cff853ad5969d92a6e0cf6a0
* (y)etsin fixes, test refactoring, and misc fixes (tjones, 2018-05-29, 1 file, -147/+97)
    * Fix etsin/етсин/этсин as noted in
      If933fc67845ac994d9ddfdf8349aff445ec9b13a
      ** only convert tsin to тсин and let the other rules sort out the e
    * Refactor most tests to be word-specific, which uncovered a couple of
      bugs in corner cases
      ** rol/üst prefix matches should match whole words (original [^ü]
         regex assumed word could not be end of string)
    * Fixed incidental bugs I noticed while looking into the items above
      ** куркчи => kürkçi was in the wrong section
      ** cönk => джонк was in the right section, but reversed
    * Added additional test cases for all of the above.
    Change-Id: Ia96be488a7b41c3ddba623b5c9262703b1c82687
* Crimean Tatar/crh transliteration odds and ends (tjones, 2018-05-22, 1 file, -0/+9)
    * refactor '\b' into WB const to make it easy to update in the future
    * add new ц-related exceptions
    Bug: T193764
    Change-Id: Ib707136f8f2598d1f8ec995bf129b436dfb53cd9
* Minor fixes to CRH language conversion. (C. Scott Ananian, 2018-05-12, 1 file, -4/+4)
    * Move a many-to-one mapping from the L2C to the C2L table where it
      belongs.
    * Fix some regular expression patterns which ended up with misnumbered
      replacement strings.
    * All regular expressions should have the `u` (unicode) flag set.
    * Typo/spelling fixes in comments
    Change-Id: If933fc67845ac994d9ddfdf8349aff445ec9b13a
* CRH Transliteration Pattern Matching Fixes (tjones, 2018-04-27, 1 file, -12/+100)
    Refactor to match exceptions as patterns, not words
    - break exception list to C2L and L2C pattern sets
    - change main loop to break only on Roman numerals and transliterate
      everything else, rather than tokenizing on single-script words (this
      fixes the km² problem, too)
    - update word anchors from ^ and $ to \b
    - only process Roman numerals for L2C translit
    - add exception for single "Roman" character followed by a period which
      looks like an initial
    - consolidate multi-step transliteration into regsConverter()
    - remove regex support from main exception list to support strtr()
    - re-organize some prefix/suffix/whole word patterns to the right place
    - add tests for recently fixed use cases
    - add support for many-to-one mappings in both directions
    - update character classes, exception lists, and regexes based on
      speaker feedback and example texts
    Misc other fixes:
    - fix some character class errors
    - remove unneeded character classes
    - add tests for Roman numerals and quotes
    - add tests for affixes and regexes
    Bug: T188321
    Bug: T189512
    Change-Id: I056d36ff2b8f63b3998a5d3a442d8d539c15488d
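The main-loop change above (split the text only on Roman numerals, transliterate everything else, and treat a single capital followed by a period as an initial rather than a numeral) can be sketched in Python as follows. This is an illustration under assumptions: the seven-letter mapping table and the regular expression are toy stand-ins invented for the example, not the crh tables or patterns from the commit.

```python
import re

# Toy Latin-to-Cyrillic table (assumption: a few letters only, for illustration).
L2C = str.maketrans({
    "a": "\u0430",  # а
    "s": "\u0441",  # с
    "ı": "\u044b",  # ы
    "r": "\u0440",  # р
    "I": "\u0418",  # И
    "V": "\u0412",  # В
    "X": "\u0425",  # Х
})

# A run of two or more Roman-numeral letters, or a single one that is NOT
# followed by a period (a lone letter plus period is treated as an initial).
ROMAN = re.compile(r"(\b(?:[IVXLCDM]{2,}|[IVXLCDM](?!\.))\b)")


def latin_to_cyrillic(text: str) -> str:
    # re.split with one capturing group alternates non-matches (even
    # indexes) and captured numerals (odd indexes).
    pieces = ROMAN.split(text)
    return "".join(
        piece if index % 2 else piece.translate(L2C)
        for index, piece in enumerate(pieces)
    )
```

Splitting on the numerals rather than tokenizing by script means every other character, including digits and punctuation, goes through the same transliteration pass.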
* Merge "Add Russian grammar forms to support Wikiversity" (jenkins-bot, 2018-03-14, 1 file, -0/+10)
|\
| * Add Russian grammar forms to support Wikiversity (Amire80, 2018-02-26, 1 file, -0/+10)
| |     Change-Id: I70fcb03db62307116ec96d4c242e6796534b57a1
* | Fix table loading bug for CRH transliteration (tjones, 2018-02-26, 1 file, -1/+17)
|/
    In production, the regex and exception tables were not being loaded,
    resulting in very poor transliteration. The loading has been moved to
    the constructor, similar to the implementation of the Kazakh
    transliteration.
    Also, a bug in the mappings for Ö/ö -> Ё/ё and Ü/ü -> Ю/ю has been
    fixed. Test cases for specific additional examples have been added.
    (Though it is worth noting that the regex and exception tables did load
    properly during unit testing, so the problem wasn't caught there.)
    Bug: T186727
    Change-Id: I6bacee7d9de6f4a870a8a9ef1f04b819ad489c02
* Generalize non-digit-grouping of four-digit numbers (Bartosz Dziewoński, 2018-01-02, 3 files, -7/+1)
    In some languages it's conventional not to insert a thousands separator
    in numbers that are four digits long (1000-9999). Rather than copy-paste
    the custom code to do this between 13 files, introduce another option
    and have the base Language class handle it.
    This also fixes an issue in several languages where this logic
    previously would not work for negative or fractional numbers.
    To implement this, a new option is added to MessagesXx.php files,
    `$minimumGroupingDigits = 2;`, with the meaning as defined in
    <http://unicode.org/reports/tr35/tr35-numbers.html>. It is a little
    roundabout, but it could allow us to migrate the number formatting
    (currently all custom code) to some generic library easily.
    Bug: T177846
    Change-Id: Iedd8de5648cf2de1c94044918626de2f96365d48
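As a rough illustration of the `minimumGroupingDigits` idea, the Python sketch below follows the TR 35 reading: a separator is inserted only when the integer part has at least group size plus minimumGroupingDigits digits. `group_digits` and its parameters are hypothetical names for this example. It is not the MediaWiki code, and a later commit in this log notes that the MediaWiki implementation carried an off-by-one error at this point.

```python
def group_digits(number: str, separator: str = ",", group_size: int = 3,
                 minimum_grouping_digits: int = 1) -> str:
    """Insert group separators into a plain decimal string (sketch only)."""
    sign = ""
    if number[:1] in ("-", "\u2212"):
        sign, number = number[0], number[1:]
    # Only the integer part is grouped; the fraction is passed through.
    integer, dot, fraction = number.partition(".")
    if len(integer) >= group_size + minimum_grouping_digits:
        groups = []
        while len(integer) > group_size:
            groups.insert(0, integer[-group_size:])
            integer = integer[:-group_size]
        groups.insert(0, integer)
        integer = separator.join(groups)
    return sign + integer + dot + fraction
```

Handling the sign and the fraction before counting digits is what lets the four-digit rule apply to negative and fractional numbers as well, which is the issue the commit says the per-language copies got wrong.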
* build: Updating mediawiki/mediawiki-codesniffer to 15.0.0 (Umherirrender, 2018-01-01, 1 file, -2/+2)
    Clean up use of @codingStandardsIgnore
    - @codingStandardsIgnoreFile -> phpcs:ignoreFile
    - @codingStandardsIgnoreLine -> phpcs:ignore
    - @codingStandardsIgnoreStart -> phpcs:disable
    - @codingStandardsIgnoreEnd -> phpcs:enable
    For phpcs:disable always the necessary sniffs are provided.
    Some start/end pairs are changed to line ignore
    Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
* Add @covers tags to languages tests (Kunal Mehta, 2017-12-28, 25 files, -16/+91)
    I removed comments that merely repeated the location of the class being
    tested.
    There are other tests in this directory that don't have a corresponding
    class and need further investigation.
    Change-Id: Ic16f0887b5030ac53fab4382cfaedfb5426cdb08
* Crimean Tatar Transliteration (tjones, 2017-11-20, 1 file, -0/+72)
    This is a first pass at Latin/Cyrillic transliteration for Crimean Tatar
    (crh). Includes transliteration tables, prefix/suffix mappings, regex
    mappings, and exceptions lists for words and abbreviations.
    Regularize CRH language name in messages/* files.
    Fix "varient" typos in qqq.json.
    Add unit tests for CRH transliteration.
    Bug: T23582
    Change-Id: I424703f99adf837f6217872b882d1ea26bfdd068
* Add test cases for digit grouping (commafy) in Polish (Bartosz Dziewoński, 2017-10-10, 1 file, -0/+27)
    According to the typographical convention, a thousands separator should
    not be inserted in numbers that are four digits long (between 1000 and
    9999), unlike in English where it's usually acceptable.
    This logic is currently implemented in LanguagePl::commafy().
    Bug: T177846
    Change-Id: I6dbd8febcf59000067cdd7d3c11111f2f77f4e66
* tests: Replace implicit Bugzilla bug numbers with Phab ones (James D. Forrester, 2017-02-21, 3 files, -7/+7)
    It's unreasonable to expect newbies to know that "bug 12345" means "Task
    T14345" except where it doesn't, so let's just standardise on the real
    numbers.
    Change-Id: I46261416f7603558dceb76ebe695a5cac274e417
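The example in the message ("bug 12345" means "Task T14345") implies a fixed offset of 2000 between the two numbering schemes. A small Python sketch of that conversion follows; `bugzilla_to_phab` is a hypothetical helper, and, as the message itself warns, some references do not follow the rule, so the result is a guess to verify rather than a guarantee.

```python
BUGZILLA_OFFSET = 2000  # offset implied by the example in the commit message


def bugzilla_to_phab(bug_id: int) -> str:
    """Map an old Bugzilla bug number to the implied Phabricator task ID."""
    return f"T{bug_id + BUGZILLA_OFFSET}"
```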
* Make the code for grammar data processing common (Amir E. Aharoni, 2016-12-16, 1 file, -1/+1)
    This makes the code for processing JSON files with grammar
    transformations reusable by different languages and applies the same
    logic to Russian and Hebrew. It will be done to other languages in
    further patches.
    This patch is not supposed to change any functionality, and the tests
    are intact (except a comment in the test for Hebrew - the class doesn't
    exist any longer).
    PHP:
    * Move the JSON grammar transformation data processing logic from
      LanguageRu.php to convertGrammar() in Language.php. By default all
      these data files are supposed to be processed identically, so the code
      should be common. If there is no JSON data file, nothing new happens.
    * LanguageRu's own convertGrammar() method is removed.
    * The LanguageHe class is removed, now that all its functionality is
      handled by generic JSON data processing in the Language class.
      LanguageHe.php file is removed from the repo and from autoloading.
    JavaScript:
    * Move the JSON grammar transformation data processing logic from ru.js
      to mediawiki.language.js.
    * JavaScript grammar code files he.js and ru.js are removed from the
      repo and from Resources.php, because all the data is in JSON, and the
      default logic in mediawiki.language.js works for both languages.
    Bug: T115217
    Change-Id: I5e75467121c3d791bb84f9e6fdfcf07c1840f81a
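A data-driven grammar transformation of the kind described above can be sketched in Python as follows. The JSON layout (a case name mapped to an ordered list of [pattern, replacement] pairs), the first-match-wins behaviour, and the toy English "plural" rules are all assumptions made for illustration. They are not the contents or exact semantics of MediaWiki's data files.

```python
import json
import re

# Toy rule file (assumption: made-up English rules, not real MediaWiki data).
RULES_JSON = r'''
{
  "@metadata": {"comment": "toy rules for illustration only"},
  "plural": [
    ["(.+[^aeiou])y$", "\\1ies"],
    ["(.+)$", "\\1s"]
  ]
}
'''

RULES = json.loads(RULES_JSON)


def convert_grammar(word: str, case: str, rules: dict) -> str:
    """Apply the first matching [pattern, replacement] rule for a case."""
    for pattern, replacement in rules.get(case, []):
        if re.search(pattern, word):
            return re.sub(pattern, replacement, word, count=1)
    # No rules for this case, or no rule matched: leave the word unchanged.
    return word
```

Because the rules live in data, adding a language in this sketch means adding a rule file rather than a subclass, which mirrors the motivation given in the commit message.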
* Update weblinks in comments from HTTP to HTTPS (Fomafix, 2016-10-11, 1 file, -2/+2)
    Use HTTPS instead of HTTP where the HTTP link is a redirect to the HTTPS
    link.
    Change-Id: I06d9e043730accc4ae71b927e0f8229f0fc3b340
* Convert all array() syntax to [] (Kunal Mehta, 2016-02-17, 53 files, -837/+837)
    Per wikitech-l consensus:
    https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html
    Notes:
    * Disabled CallTimePassByReference due to false positives (T127163)
    Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
* Add tests for LanguageConverter classes that didn't have them (Tim Starling, 2016-02-08, 7 files, -0/+319)
    Some of them don't have many test cases, or have test cases that don't
    represent the ideal transliteration and so are subject to change. But
    this is better than nothing.
    Change-Id: I4aae693bd77d9ff365f48113923ed7f9fed8d668
* Merge "Add new grammar forms for language names in Russian" (jenkins-bot, 2015-09-28, 1 file, -0/+65)
|\
| * Add new grammar forms for language names in Russian (Amir E. Aharoni, 2015-09-28, 1 file, -0/+65)
| |     CLDR provides translated language names. They are useful for showing
| |     names by themselves in menus and lists, but it's often problematic to
| |     add them to Russian sentences, because they need to be declined, so a
| |     message like "This page is not available in the $1 language" is hard
| |     to localize.
| |     This patch adds new cases for Russian - "languagegen", "languageprep"
| |     and "languageadverb". (The last one, as its name says, is not actually
| |     a grammatical case, but a transformation to an adverbial expression.)
| |     This covers most of the needs for language names that MediaWiki
| |     supports.
| |     Change-Id: Ib6a0afa5c3736f8b9b2e121cd752c53ee50fad75
* | Update Ukrainian grammar rules and tests (Amir E. Aharoni, 2015-09-27, 1 file, -0/+10)
| |     * Fix the '-ти' rule to match the name of Wikiquote.
| |     * Add tests for '-ти' and '-ник' rules.
| |     * Remove the '-ь' and '-ка' rules, which were copied from Russian and
| |       are not used in Ukrainian, and remove their tests as well.
| |     * Remove non-implemented ("stub") cases.
| |     * Cleanup the code of commafy().
| |     Change-Id: I98647ceb8806d845f3c8150b92a5d9f7fe5866f2
* | Update grammar rules and test for Ukrainian (Amir E. Aharoni, 2015-09-27, 1 file, -0/+28)
| |     The grammar rules for Ukrainian have several mistakes. This is the
| |     first in a series of commits that fix this.
| |     * Add grammar tests for PHP. There weren't any tests at all, and now
| |       there are some. No tests are added for rules that are wrong and
| |       irrelevant and will be removed in subsequent commits.
| |     * Add tests for JavaScript, and update a grammar rule that was
| |       incorrectly copied from Russian.
| |     Change-Id: I6de4581e2908eba39b33a13b07d048a34a3bd803
* | Fix issues identified by SpaceBeforeSingleLineComment sniff (Vivek Ghaisas, 2015-09-26, 2 files, -14/+14)
|/
    Change-Id: I048ccb1fa260e4b7152ca5f09b053defdd72d8f9
* Fix whitespace issues around parentheses (Vivek Ghaisas, 2015-06-16, 1 file, -1/+1)
    Fix issues found by MediaWiki.WhiteSpace.SpaceyParenthesis sniff.
    Bug: T102617
    Change-Id: Iec7f71e64081659fba373ec20d9d2006306a98f4
* Move Test files under same folder structure where class is (/languages/) (umherirrender, 2015-01-10, 48 files, -0/+2641)
    Change-Id: I25c99272a1c2e318e6c61b4a497bf04886430e9b