author: Bartosz Dziewoński <matma.rex@gmail.com> 2017-10-07 02:26:23 +0200
committer: Jforrester <jforrester@wikimedia.org> 2018-06-04 16:20:13 +0000
commit: 0313128b1038de8f2ee52a181eafdee8c5e430f7
tree: 3367d299f6f27af7d2006f6390944aa9edd1ad31 /languages/messages/MessagesPt.php
parent: 4d5b2473a41208816001c6d20fa5c093e9d7615b
Use PHP 7 "\u{NNNN}" Unicode codepoint escapes in string literals

In cases where we're operating on text data (and not binary data),
use e.g. "\u{00A0}" to refer directly to the Unicode character
'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h
(which correspond to the UTF-8 encoding of that character). This
makes it easier to look up those mysterious sequences, as not all
are as recognizable as the no-break space.

This is not enforced by PHP, but I think we should write those in
uppercase and zero-padded to at least four characters, like the
Unicode standard does.

Note that not all "\xNN" escapes can be automatically replaced:

* We can't use Unicode escapes for binary data that is not UTF-8
  (e.g. in code converting from legacy encodings or testing the
  handling of invalid UTF-8 byte sequences).
* '\xNN' escapes in regular expressions in single-quoted strings
  are actually handled by PCRE and have to be dealt with carefully
  (those regexps should probably be changed to use the /u modifier).
* "\xNN" referring to ASCII characters ("\x7F" and lower) should
  probably be left as-is.

The replacements in this commit were done semi-manually by piping
the existing "\xNN" escapes through the following terrible Ruby
script I devised:

    chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8')
    puts chars.split('').map{|char|
      '\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}'
    }.join('')

Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a
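
For readers who want to reproduce the conversion without eval, here is a minimal sketch of the same idea. It is not the script used for this commit: the helper name to_unicode_escapes is hypothetical, and it assumes the input consists only of "\xNN" escapes (anything else in the input is ignored) and that the decoded bytes are valid UTF-8.

```ruby
# Hypothetical helper, not part of the commit: turns a run of "\xNN"
# escapes into "\u{NNNN}" escapes, uppercase and zero-padded to at
# least four hex digits.
def to_unicode_escapes(escaped)
  # Collect the byte value of every \xNN escape (no eval involved).
  bytes = escaped.scan(/\\x([0-9A-Fa-f]{2})/).flatten.map { |hex| hex.to_i(16) }

  # Reassemble the bytes and interpret them as UTF-8 text.
  text = bytes.pack('C*').force_encoding('UTF-8')

  # Binary data or legacy encodings must not be converted.
  raise ArgumentError, 'bytes are not valid UTF-8' unless text.valid_encoding?

  # Emit one \u{...} escape per code point.
  text.each_char.map { |char| format('\\u{%04X}', char.ord) }.join
end

puts to_unicode_escapes('\xc2\xa0') # prints \u{00A0}
```

The validity check reflects the first caveat in the list above: byte sequences that are not UTF-8 raise an error instead of being rewritten. ASCII escapes ("\x7F" and lower) would still be converted by this sketch, so they need to be filtered out by hand.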
Diffstat (limited to 'languages/messages/MessagesPt.php')
-rw-r--r-- languages/messages/MessagesPt.php 2
1 files changed, 1 insertions, 1 deletions
diff --git a/languages/messages/MessagesPt.php b/languages/messages/MessagesPt.php
index 78503ccb4657..f57f3228f003 100644
--- a/languages/messages/MessagesPt.php
+++ b/languages/messages/MessagesPt.php
@@ -110,7 +110,7 @@ $dateFormats = [
'dmy both' => 'H\hi\m\i\n \d\e j \d\e F \d\e Y',
];
-$separatorTransformTable = [ ',' => "\xc2\xa0", '.' => ',' ];
+$separatorTransformTable = [ ',' => "\u{00A0}", '.' => ',' ];
$linkTrail = '/^([áâãàéêẽçíòóôõq̃úüűũa-z]+)(.*)$/sDu'; # T23168, T29633
$specialPageAliases = [