Title says it all. A user of YASS asked me recently whether the script supports RTL text. It does if you use Arial Unicode as the font face, but I discovered (with some help from sock earlier) that the RTL characters are drawn in reverse order.
Example:
Name appears like this in WLM:
Literally, "Adam / MeEtc - Whats up" in Hebrew
Upon upload to the server, it gets stored as this:
Adam%20/%20MeEtc%20-%20%u05DE%u05D4%20%u05E7%u05D5%u05E8%u05D4
When PHP decides to display it, it comes out as:
Literally, "Adam / MeEtc - pu stahW"
The same thing happens to Arabic text as well, and I assume any other RTL language.
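To make the stored form concrete, here is a rough sketch that decodes the uploaded string and lists the code points of the Hebrew part. `decodeEscaped` is a hypothetical helper written for this illustration, not part of YASS. It shows that the text is stored in logical (reading) order, so a renderer that simply draws glyphs left to right, which is presumably what happens here, puts the first Hebrew letter at the far left instead of the far right.

```javascript
// Hypothetical helper: decodes %uXXXX and %XX sequences as produced by escape().
function decodeEscaped(s) {
  return s.replace(/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/g, function (match, wide, narrow) {
    return String.fromCharCode(parseInt(wide || narrow, 16));
  });
}

var stored = "Adam%20/%20MeEtc%20-%20%u05DE%u05D4%20%u05E7%u05D5%u05E8%u05D4";
var decoded = decodeEscaped(stored);

// The Latin prefix "Adam / MeEtc - " is 15 characters long.
var hebrew = decoded.substring(15);

// Collect the code points in storage order.
var codes = [];
for (var i = 0; i < hebrew.length; i++) {
  codes.push(hebrew.charCodeAt(i).toString(16).toUpperCase());
}
var listing = codes.join(" ");
// listing starts with 5DE, the first letter a reader would read (rightmost on screen).
```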
So, here's my dilemma: getting the text to display the right way. I see two options: either reverse the RTL characters on the server just before they are drawn, or (what makes more sense to me) reverse their order on the client when sending the text to the server.
On the server, I'm just using imagettftext() and a function to decode the Unicode characters:
code:
function unicodedecode($str) {
    // Decodes the output of JavaScript's escape():
    // %XX is a single byte (e.g. %20 is a space), %uXXXX is a 16-bit
    // Unicode code point, which is converted to UTF-8 here.
    $res = '';
    $i = 0;
    $max = strlen($str) - 6; // last index where a full %uXXXX sequence can start
    while ($i <= $max) {
        $character = $str[$i];
        if ($character == '%' && $str[$i + 1] == 'u') {
            $value = hexdec(substr($str, $i + 2, 4));
            $i += 6;
            if ($value < 0x0080) { // 1 byte: 0xxxxxxx
                $character = chr($value);
            } else if ($value < 0x0800) { // 2 bytes: 110xxxxx 10xxxxxx
                $character = chr((($value & 0x07c0) >> 6) | 0xc0) . chr(($value & 0x3f) | 0x80);
            } else { // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                $character = chr((($value & 0xf000) >> 12) | 0xe0) . chr((($value & 0x0fc0) >> 6) | 0x80) . chr(($value & 0x3f) | 0x80);
            }
        } else {
            $i++;
        }
        $res .= $character;
    }
    // rawurldecode() instead of urldecode(): escape() leaves '+' unencoded,
    // and urldecode() would turn a literal '+' into a space.
    return rawurldecode($res . substr($str, $i));
}
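As a quick sanity check of the bit arithmetic in the decoder, the same three cases can be sketched in JavaScript and run against known values. `utf8Bytes` is a hypothetical name used only for this illustration, and like the PHP function it only covers code points up to U+FFFF.

```javascript
// Returns the UTF-8 bytes for a code point up to U+FFFF,
// using the same three cases as the PHP decoder.
function utf8Bytes(value) {
  if (value < 0x0080) {
    // 1 byte: 0xxxxxxx
    return [value];
  }
  if (value < 0x0800) {
    // 2 bytes: 110xxxxx 10xxxxxx
    return [((value & 0x07C0) >> 6) | 0xC0, (value & 0x3F) | 0x80];
  }
  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
  return [
    ((value & 0xF000) >> 12) | 0xE0,
    ((value & 0x0FC0) >> 6) | 0x80,
    (value & 0x3F) | 0x80
  ];
}
```

For the first Hebrew letter in the example, %u05DE, this gives the two bytes 0xD7 0x9E.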
In the client, I just have (more or less):
code:
escape(MsgPlus.RemoveFormatCodes(Messenger.MyName))
I think it would probably be easiest to reverse the text before it's sent to the server, but I have no idea how to accomplish this. sock gave me a basic algorithm for what needs to be done:
quote:
finding the RTL substrings (when the general direction is LTR) goes like this.
you look for an RTL char. it starts an RTL substring. for now it also ends the substring. now you see if you can extend it, so you move on. if you see an RTL char, you move the end to its location. if you see an LTR char (e.g. English letters), you stop the search. if you see a neutral char (e.g. a dot) you just move on to the next char, without changing the end's location.
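sock's description could be sketched roughly as follows, in plain ES3-style JavaScript so that it should also work as JScript on the client. The function names and the character ranges are assumptions made for this sketch: `isRTL` only covers the Hebrew/Arabic area and the presentation forms, `isLTR` only covers Latin, Greek, Cyrillic and Armenian, and everything else (spaces, punctuation, digits) is treated as neutral.

```javascript
// Rough RTL test (assumed ranges): Hebrew, Arabic and neighbouring RTL
// blocks, plus the Hebrew/Arabic presentation forms.
function isRTL(code) {
  return (code >= 0x0590 && code <= 0x08FF) ||
         (code >= 0xFB1D && code <= 0xFDFF) ||
         (code >= 0xFE70 && code <= 0xFEFC);
}

// Rough LTR test (assumed ranges): ASCII letters, then U+00C0..U+058F
// (Latin extensions, Greek, Cyrillic, Armenian).
function isLTR(code) {
  return (code >= 0x41 && code <= 0x5A) ||
         (code >= 0x61 && code <= 0x7A) ||
         (code >= 0xC0 && code <= 0x058F);
}

// Reverses every RTL substring in a string whose general direction is LTR.
function reverseRTLRuns(text) {
  var out = "";
  var i = 0;
  while (i < text.length) {
    if (!isRTL(text.charCodeAt(i))) {
      // Not the start of an RTL run: copy the character unchanged.
      out += text.charAt(i);
      i++;
      continue;
    }
    // An RTL char starts the run and, for now, also ends it.
    var start = i;
    var end = i;
    for (var j = i + 1; j < text.length; j++) {
      var code = text.charCodeAt(j);
      if (isRTL(code)) {
        end = j;   // extend the run to this RTL char
      } else if (isLTR(code)) {
        break;     // an LTR char stops the search
      }
      // neutral char: move on without changing the end
    }
    out += text.substring(start, end + 1).split("").reverse().join("");
    i = end + 1;
  }
  return out;
}
```

On the client this would wrap the existing call, along the lines of `escape(reverseRTLRuns(MsgPlus.RemoveFormatCodes(Messenger.MyName)))`. Note that a plain reversal like this also reverses multi-digit numbers inside an RTL run, moves combining marks to the wrong side of their base letter, and does nothing for Arabic letter shaping, so it is only a first approximation.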
I have tried searching the web for how to get imagettftext() to display RTL characters, but just found more questions and confusion. Anyone up for a challenge?