author | Bartosz Dziewoński <matma.rex@gmail.com> | 2017-10-07 02:26:23 +0200 |
---|---|---|
committer | Jforrester <jforrester@wikimedia.org> | 2018-06-04 16:20:13 +0000 |
commit | 0313128b1038de8f2ee52a181eafdee8c5e430f7 (patch) | |
tree | 3367d299f6f27af7d2006f6390944aa9edd1ad31 /languages/messages/MessagesPt.php | |
parent | 4d5b2473a41208816001c6d20fa5c093e9d7615b (diff) | |
Use PHP 7 "\u{NNNN}" Unicode codepoint escapes in string literals

In cases where we're operating on text data (and not binary data),
use e.g. "\u{00A0}" to refer directly to the Unicode character
'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h
(which correspond to the UTF-8 encoding of that character). This
makes it easier to look up those mysterious sequences, as not all
are as recognizable as the no-break space.
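As a quick check of the equivalence described above, the following sketch compares the two spellings byte for byte. It is written in Ruby only because the conversion script in this message is Ruby, and Ruby accepts the same "\u{NNNN}" syntax as PHP 7. The variable names are illustrative and do not come from the commit.

```ruby
# Two spellings of U+00A0 NO-BREAK SPACE.
from_bytes     = "\xC2\xA0".b   # raw bytes C2h A0h, tagged as binary
from_codepoint = "\u{00A0}"     # code point escape, stored as UTF-8

# Both literals hold the same two bytes.
same_bytes = from_bytes.bytes == from_codepoint.bytes

# Reinterpreting the raw bytes as UTF-8 yields exactly one character.
decoded    = from_bytes.dup.force_encoding('UTF-8')
codepoints = decoded.codepoints.map { |cp| format('U+%04X', cp) }

puts same_bytes          # true
puts codepoints.inspect  # ["U+00A0"]
```

The only difference between the two literals is readability: the second names the character, the first names its encoding.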

This is not enforced by PHP, but I think we should write those in
uppercase and zero-padded to at least four characters, like the
Unicode standard does.

Note that not all "\xNN" escapes can be automatically replaced:

* We can't use Unicode escapes for binary data that is not UTF-8
  (e.g. in code converting from legacy encodings or testing the
  handling of invalid UTF-8 byte sequences).
* '\xNN' escapes in regular expressions in single-quoted strings
  are actually handled by PCRE and have to be dealt with carefully
  (those regexps should probably be changed to use the /u modifier).
* "\xNN" referring to ASCII characters ("\x7F" and lower) should
  probably be left as-is.
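The first and third exceptions above can be checked mechanically. The sketch below shows one way to sort a run of bytes into "leave alone" and "candidate for replacement". The function name `classify_escape` and its three labels are invented for this illustration. It does not cover the regular-expression case, which still needs manual review.

```ruby
# Classify a run of bytes that was written with "\xNN" escapes.
# Returns :binary, :ascii or :utf8_text (labels invented for this sketch).
def classify_escape(bytes)
  text = bytes.dup.force_encoding('UTF-8')
  return :binary unless text.valid_encoding?  # not UTF-8: keep "\xNN"
  return :ascii if text.ascii_only?           # "\x7F" and lower: leave as-is
  :utf8_text                                  # candidate for "\u{NNNN}"
end

puts classify_escape("\xC2\xA0".b)  # utf8_text
puts classify_escape("\x7F".b)      # ascii
puts classify_escape("\xC2".b)      # binary (truncated UTF-8 sequence)
puts classify_escape("\xA0".b)      # binary (a lone Latin-1 byte)
```

A byte sequence can be valid UTF-8 by coincidence, so a `:utf8_text` result only means the replacement is possible, not that the surrounding code treats the data as text.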

The replacements in this commit were done semi-manually by piping
the existing "\xNN" escapes through the following terrible Ruby
script I devised:

  chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8')
  puts chars.split('').map{|char|
    '\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}'
  }.join('')
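For comparison, here is a minimal sketch of the same conversion without `eval`, which matters when the input is not trusted. It is not the script used for this commit. The function name `xnn_to_unicode_escapes` is hypothetical, and it assumes the input consists only of "\xNN" escapes (any other characters are ignored).

```ruby
# Convert literal "\xNN" escapes (e.g. '\xc2\xa0') into "\u{NNNN}"
# escapes without evaluating the input as Ruby code.
def xnn_to_unicode_escapes(source)
  # Decode each \xNN pair into one raw byte.
  bytes = source.scan(/\\x(\h{2})/).map { |(hex)| hex.to_i(16) }
  text  = bytes.pack('C*').force_encoding('UTF-8')
  raise ArgumentError, 'not valid UTF-8' unless text.valid_encoding?

  # Uppercase hex, zero-padded to at least four digits.
  text.each_char.map { |char| format('\\u{%04X}', char.ord) }.join
end

puts xnn_to_unicode_escapes('\xc2\xa0')          # \u{00A0}
puts xnn_to_unicode_escapes('\xe2\x80\x8f')      # \u{200F}
puts xnn_to_unicode_escapes('\xf0\x9f\x98\x80')  # \u{1F600}
```

The `%04X` format pads short code points to four digits and leaves longer ones untouched, which matches the convention proposed earlier in this message.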
Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a
Diffstat (limited to 'languages/messages/MessagesPt.php')
-rw-r--r-- | languages/messages/MessagesPt.php | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/languages/messages/MessagesPt.php b/languages/messages/MessagesPt.php
index 78503ccb4657..f57f3228f003 100644
--- a/languages/messages/MessagesPt.php
+++ b/languages/messages/MessagesPt.php
@@ -110,7 +110,7 @@ $dateFormats = [
 	'dmy both' => 'H\hi\m\i\n \d\e j \d\e F \d\e Y',
 ];
 
-$separatorTransformTable = [ ',' => "\xc2\xa0", '.' => ',' ];
+$separatorTransformTable = [ ',' => "\u{00A0}", '.' => ',' ];
 $linkTrail = '/^([áâãàéêẽçíòóôõq̃úüűũa-z]+)(.*)$/sDu'; # T23168, T29633
 
 $specialPageAliases = [