| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depending on which namespace we want these classes to have after
T166010 they could either stay in includes/languages/ (plural) in
their own MediaWiki\Languages\-namespace dedicated to Language
subclasses, or they could go into a subdirectory like
`includes/language/languages/` if we want to keep them in the same
top-level namespace as other Language classes and services, but in
a more nested namespace.
For now, I've made the smaller change and kept the Language subclasses
in their own directory directly under includes/, not nested further.
Bug: T225756
Change-Id: I01015424707b442853879fd50c97f00215e5c2fa
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* yeild -> yield
* paramter -> parameter
* seperator -> separator
* neccesary -> necessary
* inital -> initial
* intial -> initial
* repsonse -> response
* retreived -> retrieved
Bug: T201491
Change-Id: I461941b027590997448f3bdd8a137a48bb338beb
|
|
|
|
| |
Change-Id: Ic663e81cca0bf007804a70772250914a85f1fef4
|
|
|
|
|
|
| |
Also fix some whitespaces
Change-Id: Ibed50a4f07442d3f299cf545c16f5dbb5f27a411
|
|
|
|
|
| |
Bug: T10327
Change-Id: I4b315d439fef7d7cdf2fc5ae1904e0460a2a60e0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Update digitGroupingPattern to match CLDR 31: New versions of CLDR have
a digit grouping pattern with a decimal part. Update digitGroupingPattern
values in Message classes with this improved pattern.
Refer: http://unicode.org/reports/tr35/tr35-numbers.html
* Refer the following chart for the decimal patterns.
http://www.unicode.org/cldr/charts/31/by_type/numbers.number_formatting_patterns.html
* Uses PHP NumberFormatter class for the commafy implementation, which
is available in PHP 7.
* Some tests needed updating to match the TR 35 spec
* The formatNum public method in Language.php is the preferred way to
use this feature. It does separator transformation and digit transformation
wherever applicable.
* Renamed the second param name for formatNum from noCommafy to noSeparators
* The commafy method is deprecated and formatNum is preferred. In practice,
we are not just adding commas, but separators according to the language.
Replaced some tests based on commafy methods with tests based on formatNum.
Note: The corresponding js implementation is not changed in this commit.
It would probably be a good idea to use globalize.js, which is also based
on the CLDR patterns.
Note: This patch preserves the existing off-by-one error in
$minimumGroupingDigits; T262500 will eventually fix this.
Bug: T167088
Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
Change-Id: Ic721b9a91e78e4ef07040339d1006b7a90a910c0
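To illustrate how a CLDR-style pattern with a decimal part drives digit grouping, here is a minimal sketch in Python. It is not the PHP NumberFormatter-based code from this commit; the function names `grouping_sizes` and `group_digits` are hypothetical, and the sketch only handles the integer digits of a non-negative number.

```python
from typing import Tuple


def grouping_sizes(pattern: str) -> Tuple[int, int]:
    """Return (primary, secondary) group sizes for a pattern like '#,##,##0.###'."""
    integer_part = pattern.split(".", 1)[0]  # drop the decimal part of the pattern
    pieces = integer_part.split(",")
    if len(pieces) == 1:
        return (0, 0)  # no grouping separator in the pattern
    primary = len(pieces[-1])  # placeholders right before the decimal point
    secondary = len(pieces[-2]) if len(pieces) > 2 else primary
    return (primary, secondary)


def group_digits(digits: str, pattern: str, sep: str = ",") -> str:
    """Insert separators into a string of integer digits according to the pattern."""
    primary, secondary = grouping_sizes(pattern)
    if primary == 0 or len(digits) <= primary:
        return digits
    head, tail = digits[:-primary], digits[-primary:]
    groups = [tail]
    # Remaining digits are split using the secondary group size.
    while len(head) > secondary:
        groups.insert(0, head[-secondary:])
        head = head[:-secondary]
    groups.insert(0, head)
    return sep.join(groups)
```

With the pattern `#,##0.###` the digits `1234567` group in threes, while `#,##,##0.###` produces the 3-then-2 grouping used in some locales.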
|
|
|
|
|
| |
Bug: T257500
Change-Id: I30a892a936c0ed9247bc6b63be747697cb9f3e26
|
|
|
|
|
|
|
|
| |
Prepending a default preposition to locative GRAMMAR forms
in Ukrainian language
Bug: T149550
Change-Id: I4649549dc3c722e53c7ea3accb6747df420e56f7
|
|
|
|
|
|
|
|
|
| |
Done with `composer fix` and suppressing the rest (i.e. sniffs for
global variables, which for core should be suppressed anyway).
Additionally, add `-p` to `phpcbf`, as otherwise it just seems stuck.
Change-Id: Ide8d6cdd083655891b6d654e78440fbda81ab2bc
|
|
|
|
|
|
|
| |
reduce/deprecate visibility of some members of the Language class
Bug: T243913
Change-Id: I6bad608455ceaa46f895f00dcc6380cec6d32680
|
|
|
|
|
|
| |
Bug: T241352
Bug: T241353
Change-Id: Idefd5624d761fad4a6c3cca950ce1038a4dec770
|
|\
| |
| |
| | |
easier navigation through tests"
|
| |
| |
| |
| |
| |
| |
| | |
navigation through tests
Bug: T226833, T243761
Change-Id: Ied7d4a1db661f5cfaefe6c392348ff56b1a5616c
|
|/
|
|
|
| |
Bug: T226833, T243760
Change-Id: I6fc7f267098d663fbefd0e78457726c343c9b3e4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Done:
* Replace LanguageConverter::newConverter by LanguageConverterFactory::getLanguageConverter
* Remove LanguageConverter::newConverter from all subclasses
* Add LanguageConverterFactory integration tests which cover all languages by their code.
* Caching of LanguageConverters in factory
* Make all tests run (hopefully that is enough)
* Uncomment the deprecated functions.
* Rename FakeConverter to TrivialLanguageConverter
* Create ILanguageConverter to have shared ancestor
* Make the LanguageConverter class abstract.
* Create table with mapping between lang code and converter instead of using name convention
* ILanguageConverter @internal
* Clean up code
Change-Id: I0e4d77de0f44e18c19956a1ffd69d30e63cf51bf
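The factory approach listed above (an explicit code-to-converter table instead of a naming convention, plus caching) can be sketched roughly as follows. This is a Python illustration only; all class names and the language code `xx` are hypothetical stand-ins, not MediaWiki's actual PHP classes.

```python
class TrivialConverter:
    """Fallback converter that returns text unchanged."""

    def __init__(self, code: str) -> None:
        self.code = code

    def convert(self, text: str) -> str:
        return text


class UppercaseConverter(TrivialConverter):
    """Hypothetical converter used only to show the mapping table."""

    def convert(self, text: str) -> str:
        return text.upper()


class ConverterFactory:
    # Explicit table from language code to converter class.
    _CONVERTER_BY_CODE = {"xx": UppercaseConverter}

    def __init__(self) -> None:
        self._cache = {}

    def get_converter(self, code: str) -> TrivialConverter:
        # Build each converter once, then reuse the cached instance.
        if code not in self._cache:
            cls = self._CONVERTER_BY_CODE.get(code, TrivialConverter)
            self._cache[code] = cls(code)
        return self._cache[code]
```

Codes without a table entry fall back to the trivial converter, so callers always receive an object with the same interface.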
Bug: T226833, T243332
|
|
|
|
| |
Change-Id: I86fc55a4fc8ceafe368692173211bbcd6d8581d7
|
|
|
|
|
|
| |
Per inline @todo tag, split the test
Change-Id: I38f01c2215b9ba09d6af117ae70b242de362b51f
|
|
|
|
| |
Change-Id: I11f8801ef47ec1a1f63d840116e69667e6f3ae3c
|
|
|
|
|
|
| |
Align the doc stars and normalize start and end tokens
Change-Id: Ib0d92e128e7b882bb5b838bd00c74fc16ef14303
|
|
|
|
| |
Change-Id: I78b3315f26ab91b6b443f5b028a635552f82f5a3
|
|
|
|
|
|
|
|
| |
* Add `@covers LanguageGa`.
* Language code `bs` is for "Bosnian (bosanski)" and not for "Croatian
(hrvatski)".
Change-Id: I605bdd254518dd708343e36a2dee65dd0aa17b63
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In d59f27aeab08b171e5ab6a081e763a4cad0bca04 we made
LanguageConverter::validateVariant() try harder to convert a variant
into an acceptable MediaWiki-internal form, looking at deprecated
codes and BCP 47 aliases. However, this misled Language::hasVariant()
into thinking that bogus names (like all-uppercase strings) were
acceptable variant names, which then led to exceptions when they were
passed to the various conversion methods.
This is a belt-and-suspenders patch for T207433 -- in that case we
shouldn't have created a Language object with code 'sr-cyrl' in the
first place, but once one was created we shouldn't have tried to
ask LanguageSr to convert texts to 'sr-cyrl'. The latter problem
is fixed by this patch.
Bug: T207433
Change-Id: Id993bc7989144b5031a551662e8e492bd23f698a
|
|
|
|
|
|
|
|
|
| |
These were introduced in MW 1.17 and are always true in production.
They were useful to allow folks to defer title conversion, but it's
been a long time now. We don't need to make this optional any more.
Change-Id: I65dcfe80dc3e1dfeb4d63924a8928655e012a20c
|
|
|
|
| |
Change-Id: I16c660e54191b63cd6eb3407cb00504665930c4e
|
|
|
|
|
|
|
|
|
| |
* Exclude the data files from PHPUnit coverage.
* Add tests covering the normalize() implementations.
* Fix a small todo about using data providers.
* Set explicit visibility.
Change-Id: Ib104cc3215a36901cff853ad5969d92a6e0cf6a0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix etsin/етсин/этсин as noted in If933fc67845ac994d9ddfdf8349aff445ec9b13a
** only convert tsin to тсин and let the other rules sort out the e
* Refactor most tests to be word-specific, which uncovered a couple of
bugs in corner cases
** rol/üst prefix matches should match whole words (original [^ü] regex
assumed word could not be end of string)
* Fixed incidental bugs I noticed while looking into the items above
** куркчи => kürkçi was in the wrong section
** cönk => джонк was in the right section, but reversed
* Added additional test cases for all of the above.
Change-Id: Ia96be488a7b41c3ddba623b5c9262703b1c82687
|
|
|
|
|
|
|
|
| |
* refactor '\b' into WB const to make it easy to update in the future
* add new ц-related exceptions
Bug: T193764
Change-Id: Ib707136f8f2598d1f8ec995bf129b436dfb53cd9
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Move a many-to-one mapping from the L2C to the C2L table where it
belongs.
* Fix some regular expression patterns which ended up with misnumbered
replacement strings.
* All regular expressions should have the `u` (unicode) flag set.
* Typo/spelling fixes in comments
Change-Id: If933fc67845ac994d9ddfdf8349aff445ec9b13a
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor to match exceptions as patterns, not words
- break exception list to C2L and L2C pattern sets
- change main loop to break only on Roman numerals and transliterate
everything else, rather than tokenizing on single-script words
(this fixes the km² problem, too)
- update word anchors from ^ and $ to \b
- only process Roman numerals for L2C translit
- add exception for single "Roman" character followed by a period
which looks like an initial
- consolidate multi-step transliteration into regsConverter()
- remove regex support from main exception list to support strtr()
- re-organize some prefix/suffix/whole word patterns to the right place
- add tests for recently fixed use cases
- add support for many-to-one mappings in both directions
- update character classes, exception lists, and regexes based on
speaker feedback and example texts
Misc other fixes:
- fix some character class errors
- remove unneeded character classes
- add tests for Roman numerals and quotes
- add tests for affixes and regexes
Bug: T188321
Bug: T189512
Change-Id: I056d36ff2b8f63b3998a5d3a442d8d539c15488d
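As a rough illustration of the main-loop change described above (break only on Roman numerals, transliterate everything else, and treat a single "Roman" letter followed by a period as an initial), here is a small Python sketch. The mapping table is a toy, the regex is a simplification, and `transliterate_l2c` is a hypothetical name; none of this is the actual converter code.

```python
import re

# Toy Latin-to-Cyrillic table (hypothetical and far from complete).
TOY_L2C = str.maketrans({
    "a": "\u0430",       # Cyrillic a
    "s": "\u0441",       # Cyrillic es
    "\u0131": "\u044b",  # dotless i -> Cyrillic yeru
    "r": "\u0440",       # Cyrillic er
    "V": "\u0412",       # Cyrillic ve (capital)
})

# A run of two or more Roman-numeral letters, or a single one that is
# not followed by a period (a single letter plus period looks like an initial).
ROMAN = re.compile(r"\b(?:[IVXLCDM]{2,}|[IVXLCDM](?!\.))\b")


def transliterate_l2c(text: str) -> str:
    """Transliterate everything except Roman numerals, which are kept as-is."""
    out = []
    pos = 0
    for match in ROMAN.finditer(text):
        out.append(text[pos:match.start()].translate(TOY_L2C))
        out.append(match.group(0))  # leave the numeral untouched
        pos = match.end()
    out.append(text[pos:].translate(TOY_L2C))
    return "".join(out)
```

Here "XIV" survives unchanged, while a lone "V." is treated as an initial and transliterated with the rest of the text.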
|
|\ |
|
| |
| |
| |
| | |
Change-Id: I70fcb03db62307116ec96d4c242e6796534b57a1
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In production, the regex and exception tables were not being loaded,
resulting in very poor transliteration. The loading has been moved to
the constructor, similar to the implementation of the Kazakh
transliteration.
Also, a bug in the mappings for Ö/ö -> Ё/ё and Ü/ü -> Ю/ю has been
fixed.
Test cases for specific additional examples have been added. (Though
it is worth noting that the regex and exception tables did load
properly during unit testing, so the problem wasn't caught there.)
Bug: T186727
Change-Id: I6bacee7d9de6f4a870a8a9ef1f04b819ad489c02
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some languages it's conventional not to insert a thousands
separator in numbers that are four digits long (1000-9999).
Rather than copy-paste the custom code to do this between 13 files,
introduce another option and have the base Language class handle it.
This also fixes an issue in several languages where this logic
previously would not work for negative or fractional numbers.
To implement this, a new option is added to MessagesXx.php files,
`$minimumGroupingDigits = 2;`, with the meaning as defined in
<http://unicode.org/reports/tr35/tr35-numbers.html>. It is a little
roundabout, but it could allow us to migrate the number formatting
(currently all custom code) to some generic library easily.
Bug: T177846
Change-Id: Iedd8de5648cf2de1c94044918626de2f96365d48
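A minimal Python sketch of the minimum-grouping-digits idea, following the TR 35 reading that a value of 2 requires at least two digits before the first separator. The function name and parameters are hypothetical, and this is not the Language class implementation (which a later commit notes has an off-by-one quirk).

```python
def format_grouped(number: str, min_grouping_digits: int = 1,
                   sep: str = ",", group_size: int = 3) -> str:
    """Group the integer digits of a decimal string such as '-1234.5'."""
    sign = "-" if number.startswith("-") else ""
    unsigned = number[1:] if sign else number
    integer, dot, fraction = unsigned.partition(".")
    # Only group when enough digits would remain before the first separator.
    if len(integer) >= group_size + min_grouping_digits:
        groups = []
        while len(integer) > group_size:
            groups.insert(0, integer[-group_size:])
            integer = integer[:-group_size]
        groups.insert(0, integer)
        integer = sep.join(groups)
    return sign + integer + dot + fraction
```

Handling the sign and the fraction separately from the integer digits is what lets the same rule apply to negative and fractional numbers.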
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up use of @codingStandardsIgnore
- @codingStandardsIgnoreFile -> phpcs:ignoreFile
- @codingStandardsIgnoreLine -> phpcs:ignore
- @codingStandardsIgnoreStart -> phpcs:disable
- @codingStandardsIgnoreEnd -> phpcs:enable
For phpcs:disable, the necessary sniffs are always provided.
Some start/end pairs are changed to line ignores.
Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
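The annotation mapping above could be applied mechanically with something like the following Python sketch. The function name is hypothetical, and the sketch only swaps the annotation keywords; it does not add the sniff names after phpcs:disable or turn start/end pairs into line ignores, which the commit did by hand.

```python
import re

# Old annotation -> new annotation, as listed in the commit message.
REPLACEMENTS = {
    "@codingStandardsIgnoreFile": "phpcs:ignoreFile",
    "@codingStandardsIgnoreLine": "phpcs:ignore",
    "@codingStandardsIgnoreStart": "phpcs:disable",
    "@codingStandardsIgnoreEnd": "phpcs:enable",
}

PATTERN = re.compile("|".join(re.escape(old) for old in REPLACEMENTS))


def migrate_annotations(source: str) -> str:
    """Replace every old-style annotation in the given source text."""
    return PATTERN.sub(lambda m: REPLACEMENTS[m.group(0)], source)
```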
|
|
|
|
|
|
|
|
| |
I removed comments that merely repeated the location of the class being
tested. There are other tests in this directory that don't have a
corresponding class and need further investigation.
Change-Id: Ic16f0887b5030ac53fab4382cfaedfb5426cdb08
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a first pass at Latin/Cyrillic transliteration for Crimean
Tatar (crh).
Includes transliteration tables, prefix/suffix mappings, regex
mappings, and exceptions lists for words and abbreviations.
Regularize CRH language name in messages/* files.
Fix "varient" typos in qqq.json.
Add unit tests for CRH transliteration.
Bug: T23582
Change-Id: I424703f99adf837f6217872b882d1ea26bfdd068
|
|
|
|
|
|
|
|
|
|
| |
According to the typographical convention, a thousands separator
should not be inserted in numbers that are four digits long (between
1000 and 9999), unlike in English where it's usually acceptable.
This logic is currently implemented in LanguagePl::commafy().
Bug: T177846
Change-Id: I6dbd8febcf59000067cdd7d3c11111f2f77f4e66
|
|
|
|
|
|
|
| |
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.
Change-Id: I46261416f7603558dceb76ebe695a5cac274e417
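The example in this message implies an offset of 2000 between old bug numbers and task numbers (12345 becomes T14345). A hedged Python sketch of that rewrite follows; the function name is hypothetical, and it deliberately ignores the "except where it doesn't" cases, which need manual checking.

```python
import re

# Offset inferred from the example in the commit message (12345 -> T14345).
BUG_TO_TASK_OFFSET = 2000


def bug_to_task(text: str) -> str:
    """Rewrite 'bug NNNN' references as 'TNNNN' using the fixed offset."""
    return re.sub(
        r"\bbug (\d+)\b",
        lambda m: "T{}".format(int(m.group(1)) + BUG_TO_TASK_OFFSET),
        text,
        flags=re.IGNORECASE,
    )
```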
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the code for processing JSON files with
grammar transformations reusable by different languages
and applies the same logic to Russian and Hebrew.
It will be done to other languages in further patches.
This patch is not supposed to change any functionality,
and the tests are intact (except a comment in the test
for Hebrew - the class doesn't exist any longer).
PHP:
* Move the JSON grammar transformation data processing logic
from LanguageRu.php to convertGrammar() in Language.php.
By default all these data files are supposed to be
processed identically, so the code should be common.
If there is no JSON data file, nothing new happens.
* LanguageRu's own convertGrammar() method is removed.
* The LanguageHe class is removed, now that all its functionality
is handled by generic JSON data processing in the Language class.
LanguageHe.php file is removed from the repo and from autoloading.
JavaScript:
* Move the JSON grammar transformation data processing logic
from ru.js to mediawiki.language.js.
* JavaScript grammar code files he.js and ru.js are removed
from the repo and from Resources.php, because all the data
is in JSON, and the default logic in mediawiki.language.js
works for both languages.
Bug: T115217
Change-Id: I5e75467121c3d791bb84f9e6fdfcf07c1840f81a
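To show what generic processing of JSON grammar data can look like, here is a minimal Python sketch. The data layout (a case name mapped to an ordered list of pattern/replacement pairs, with the first matching rule winning) is an assumption made for illustration, the rules are toy data, and `convert_grammar` here is not the PHP method from Language.php.

```python
import json
import re

# Toy rules for illustration only; real data files differ.
TOY_RULES_JSON = """
{
  "@metadata": {"comment": "toy data for illustration only"},
  "genitive": [
    ["ia$", "ii"],
    ["$", "a"]
  ]
}
"""


def convert_grammar(word: str, case: str, rules: dict) -> str:
    """Apply the first matching rule for the case; return word unchanged otherwise."""
    for pattern, replacement in rules.get(case, []):
        new_word, count = re.subn(pattern, replacement, word, count=1)
        if count:
            return new_word
    return word


rules = json.loads(TOY_RULES_JSON)
```

Because the logic is data-driven, a language with only a JSON file needs no class of its own, and a missing case simply leaves the word untouched.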
|
|
|
|
|
|
| |
Use HTTPS instead of HTTP where the HTTP link is a redirect to the HTTPS link.
Change-Id: I06d9e043730accc4ae71b927e0f8229f0fc3b340
|
|
|
|
|
|
|
|
|
|
| |
Per wikitech-l consensus:
https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html
Notes:
* Disabled CallTimePassByReference due to false positives (T127163)
Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
|
|
|
|
|
|
|
|
| |
Some of them don't have many test cases, or have test cases that don't
represent the ideal transliteration and so are subject to change. But
this is better than nothing.
Change-Id: I4aae693bd77d9ff365f48113923ed7f9fed8d668
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
CLDR provides translated language names. They are useful for showing
names by themselves in menus and lists, but it's often problematic to add them
to Russian sentences, because they need to be declined, so a message like
"This page is not available in the $1 language" is hard to localize.
This patch adds new cases for Russian -
"languagegen", "languageprep" and "languageadverb".
(The last one, as its name says, is not actually a grammatical case,
but a transformation to an adverbial expression.)
This covers most of the needs for language names that MediaWiki supports.
Change-Id: Ib6a0afa5c3736f8b9b2e121cd752c53ee50fad75
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* Fix the '-ти' rule to match the name of Wikiquote.
* Add tests for '-ти' and '-ник' rules.
* Remove the '-ь' and '-ка' rules, which were copied from Russian
and are not used in Ukrainian, and remove their tests as well.
* Remove non-implemented ("stub") cases.
* Clean up the code of commafy().
Change-Id: I98647ceb8806d845f3c8150b92a5d9f7fe5866f2
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The grammar rules for Ukrainian have several mistakes.
This is the first in a series of commits that fix this.
* Add grammar tests for PHP. There weren't any tests at all,
and now there are some. No tests are added for rules that
are wrong and irrelevant and will be removed in subsequent commits.
* Add tests for JavaScript, and update a grammar rule that was
incorrectly copied from Russian.
Change-Id: I6de4581e2908eba39b33a13b07d048a34a3bd803
|
|/
|
|
| |
Change-Id: I048ccb1fa260e4b7152ca5f09b053defdd72d8f9
|
|
|
|
|
|
|
| |
Fix issues found by MediaWiki.WhiteSpace.SpaceyParenthesis sniff.
Bug: T102617
Change-Id: Iec7f71e64081659fba373ec20d9d2006306a98f4
|
|
Change-Id: I25c99272a1c2e318e6c61b4a497bf04886430e9b
|