| Commit message (Collapse) | Author | Age | Files | Lines |
We have to use a tertiary sortkey for everything with the primary
sortkey of 2627. Otherwise, the "Remove duplicate prefixes" logic
in IcuCollation would remove them.
The following characters will now be considered separate letters in
the 'xx-uca-fa' collation for the purpose of displaying the headings
on category pages: ء ئ ا و ٲ ٳ
Bug: T139110
Change-Id: Ibbea5d76348e4cdc38b74cba44286910b2ed592f
|
Instances of subclasses of IcuCollation with customizations for
specific languages probably shouldn't share this cache with instances
of IcuCollation with the same language.
Change-Id: I06d66d199c99448a3375381baef0366c4d99c8c4
|
Bug: T139110
Change-Id: Ie15a2ee1c22ff4a1d2b721ed137227fe83dd12ea
|
This makes the numeric collation sort localized digits for the
current content language the same way as 0-9.
This only deals with localized digits; commas and other number
formatting are still not handled. Unusual "numerical" Unicode
characters are also not handled.
I was unsure whether to make a "family" of numeric collations
where you specify numeric-<lang code>, or if it should
just use $wgContLang. Given that $wgContLang effectively
never changes, and also affects all other digit handling,
I opted to just use $wgContLang.
Any wiki currently using the 'numeric' collation will need
updateCollation.php --force to be run after this change is
deployed. At the moment that includes:
bnwiki, bnwikisource and hewiki
Bug: T148873
Change-Id: I9eda52a8a9752a91134d1118546b0a80d3980ccf
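The digit handling described in this message can be illustrated with a small Python sketch. It is not the MediaWiki implementation: the function names, the hard-coded Bengali digit table, and the key layout are all assumptions made for illustration. The idea is only to map the content language's digits onto 0-9 before building a numeric sort key.

```python
import re

# Assumption for illustration: the content language is Bengali, whose
# digits occupy U+09E6..U+09EF. A real implementation would take the digit
# table from the language configuration instead of hard-coding it.
BENGALI_DIGITS = "".join(chr(0x09E6 + i) for i in range(10))
TO_ASCII_DIGITS = str.maketrans(BENGALI_DIGITS, "0123456789")

DIGIT_RUN = re.compile(r"([0-9]+)")


def localized_numeric_key(text):
    """Hypothetical sort key: localized digits compare like 0-9."""
    normalized = text.translate(TO_ASCII_DIGITS)
    parts = DIGIT_RUN.split(normalized)
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd indices always hold digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


two = "Page " + chr(0x09E8)                    # Bengali digit 2
twelve = "Page " + chr(0x09E7) + chr(0x09E8)   # Bengali digits 1, 2
print(sorted([twelve, two], key=localized_numeric_key) == [two, twelve])
```

As in the commit message, separators such as commas are not treated as part of a number here; only runs of digits are.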
|
This is based solely on looking at the bn.txt collation data
file. It has not been tested by native speakers.
Bug: T148885
Change-Id: Ide926bc5ee8752269ef6a1bfe972e19b7188d193
|
At this point I think it's safe to assume that these mostly work well,
and the split makes maintenance of the alphabetical list more difficult
(some entries were already in the wrong order). We've been enabling these
collations for more and more Wikimedia wikis and not hearing about any
problems. Mistakes, if any are present, should be treated like any
other bug.
Also made some comments consistent.
Change-Id: I4b5fbcf4dbbdd4dc194ed821341296171fa64bb0
|
Based on CLDR 29 data files.
This covers the relatively easy languages in CLDR 29 (which is most
of them). I skipped languages with complicated tailoring files.
Change-Id: I8367604f7d3a1cdef9cb4e15813893c8cbfff1ff
|
Bug: T148774
Change-Id: I34aa330645d9d82b6c4e57542e891dd2b36e42ad
|
A few more languages marked as "Verified by native speakers",
based on which collations we've been using in production
on Wikimedia wikis.
(I'm not sure if this makes sense now that we're fairly confident
that these are good in general, but since it's already here...)
Change-Id: I8e1f31fa61509eca8c76a2df4e18638005e68b77
|
This collation orders text with numbers "naturally", so that
'Foo 1' < 'Foo 2' < 'Foo 12'.
Note that this only works in terms of sequences of digits, and the
behavior for decimal fractions or pretty-formatted numbers may be
unexpected.
This is only expected to work mostly correctly for English-language
text. Consider it a proof of concept. You probably want to use
a UCA collation with the '-u-kn' suffix rather than this.
Bug: T8948
Change-Id: Ie268f2d92c5c75d0aaecf54ede2bdda1af3b309d
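The "natural" ordering described in this message can be sketched in a few lines of Python. This is an illustration of the general technique, not the code added by the commit; the name natural_sort_key and the key layout are assumptions. It also reproduces the caveat about decimal fractions, since each run of digits is compared as a separate integer.

```python
import re

DIGIT_RUN = re.compile(r"([0-9]+)")


def natural_sort_key(text):
    """Hypothetical sort key: digit runs compare as integers."""
    parts = DIGIT_RUN.split(text)
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd indices always hold digit runs and types line up per position.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


titles = ["Foo 12", "Foo 2", "Foo 1"]
print(sorted(titles, key=natural_sort_key))

# Decimal fractions are not understood: "1.5" becomes the runs 1 and 5,
# "1.25" becomes 1 and 25, so "1.5" sorts before "1.25".
print(natural_sort_key("1.5") < natural_sort_key("1.25"))
```

A plain string sort would put 'Foo 12' before 'Foo 2'; the key above avoids that by comparing 2 and 12 as numbers.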
|
Per https://ssl.icu-project.org/trac/browser/icu/trunk/source/data/coll/mk.txt
Bug: T26953
Change-Id: I45938402923a109cfc80f59555af5cede584fc3b
|
To use, add '-u-kn' to the end of a collation name and set it as
the value for $wgCategoryCollation.
Bug: T8948
Change-Id: Ica7908daf80624fa2648127114d01665e96234c0
|
Change-Id: I35c2cdd2c56b491229f1f6d8b69b1de21af23aab
|
Bug: T139110
Change-Id: If174e02160c954500233e3a57945e267f2b4ae29
|
First letters are based on
https://ssl.icu-project.org/trac/browser/icu/trunk/source/data/coll/ta.txt
This commit has not been verified by a native speaker yet, but
is probably right.
Bug: T75453
Change-Id: Ic9bb3658868917790aa770c99f8f280f2dd3eace
|
Small optimization to IcuCollation::fetchFirstLetterData().
This used to suppress and restore warnings once for every letter of
every alphabet. The workaround involving string casting and error
suppression is no longer needed as of PHP 5.3, in which the
bug was fixed.
Change-Id: Idd41a509858c0887df4f632b480b387bd74027b2
|
* Factor out fetchFirstLetterData() as a separate method.
* Move 'version' into the key instead of checking afterwards.
* Use getWithSetCallback() for the cache handling.
(Depends on version being in the key).
Change-Id: I15bddf5d1dabcdcef47a938447ba59436bd8a294
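The reason the callback-style cache access depends on the version being in the key can be shown with a small Python sketch. Everything here is hypothetical: InMemoryCache, first_letters_key, the key layout, and the placeholder data are illustrative stand-ins, not MediaWiki's cache classes or its real key format.

```python
class InMemoryCache:
    """Hypothetical stand-in for a cache with a get-or-compute helper."""

    def __init__(self):
        self._store = {}

    def get_with_set_callback(self, key, compute):
        # On a miss, compute the value once and store it under the key.
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]


def first_letters_key(wiki_id, locale, icu_version):
    # Assumed key layout. With the version in the key, data computed for
    # an older ICU version is simply never looked up again, so no
    # "check the version afterwards" step is needed after a hit.
    return f"{wiki_id}:first-letters:{locale}:{icu_version}"


def fetch_first_letter_data(locale):
    # Placeholder for the expensive computation.
    return {"locale": locale, "letters": ["A", "B", "C"]}


cache = InMemoryCache()
key = first_letters_key("examplewiki", "fr", "4.8.1.1")
data = cache.get_with_set_callback(key, lambda: fetch_first_letter_data("fr"))
print(data["letters"])
```

If the version were stored inside the cached value instead, the callback helper would return stale data on a hit and the caller would still have to validate and recompute by hand.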
|
I noticed that `frwiki:first-letters:fr:fr:4.8.1.1` was at the very top of keys
sorted by bandwidth (that is, reqs/sec * size) on one of the memcache servers
on WMF prod.
The data takes ~60-80 ms to compute in case of a cache miss. That's not
enough to justify using a tiered cache abstraction here, IMO.
Change-Id: If81ce8f86f2c378565f1f6a0dd2c04dee825c4e9
|
Change-Id: Iec56ac4d1418737d171f8faa9c8f498fba5383ee
|
Change-Id: I6abfecf91cdce83dd34b1e8aa8e0b35315f62742
|