PHP and/or JScript help: RTL text

PHP and/or JScript help: RTL text by MeEtc on 09-18-2008 at 05:46 AM

Title says it all. A YASS user recently asked me whether the script supports RTL text. It does if Arial Unicode is used as the font, but what I discovered (with some help from sock earlier) is that the characters are drawn in reverse order.

Example:
Name appears like this in WLM: [Image: 1.png]
Literally, "Adam / MeEtc - What's up" in Hebrew
Upon upload to the server, it gets stored as this:
Adam%20/%20MeEtc%20-%20%u05DE%u05D4%20%u05E7%u05D5%u05E8%u05D4

When PHP decides to display it, it comes out as: [Image: 2.png]
Literally, "Adam / MeEtc - pu stahW"

The same thing happens to Arabic text as well, and I assume any other RTL language.

So here's my dilemma: getting the text to display the right way. The two options I have are to reverse the RTL characters on the server before they are drawn, or (what makes more sense to me) to reverse their order in the client before the text is sent to the server.

On the server, I'm just using imagettftext() and a function to decode the Unicode characters:

code:
function unicodedecode($str){
    // Decodes Unicode characters in %uXXXX form as well as standard URL encoding,
    // e.g. %20 is a space, %u0123 is a Unicode character
    $res = '';
    $i = 0;
    $max = strlen($str) - 6;
    while ($i <= $max){
        $character = $str[$i];
        if ($character == '%' && $str[$i + 1] == 'u'){
            // %uXXXX: convert the 16-bit code point to UTF-8 bytes
            $value = hexdec(substr($str, $i + 2, 4));
            $i += 6;
            if ($value < 0x0080) // 1 byte: 0xxxxxxx
                $character = chr($value);
            else if ($value < 0x0800) // 2 bytes: 110xxxxx 10xxxxxx
                $character = chr((($value & 0x07c0) >> 6) | 0xc0) . chr(($value & 0x3f) | 0x80);
            else // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                $character = chr((($value & 0xf000) >> 12) | 0xe0) . chr((($value & 0x0fc0) >> 6) | 0x80) . chr(($value & 0x3f) | 0x80);
        }else
            $i++;
        $res .= $character;
    }
    // Append the unscanned tail, then decode the remaining %XX escapes
    return urldecode($res . substr($str, $i));
}


In the client, I just have (more or less):
code:
escape(MsgPlus.RemoveFormatCodes(Messenger.MyName))
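
For reference, here is a small sketch of what escape() produces for the example name above. It uses a literal string in place of Messenger.MyName, and the variable names are only illustrative. The point is that the encoded string carries the Hebrew characters in logical (typing) order, so any reversal has to happen either before this call or after decoding on the server.

```javascript
// Literal stand-in for MsgPlus.RemoveFormatCodes(Messenger.MyName) (assumption).
var displayName = "Adam / MeEtc - \u05DE\u05D4 \u05E7\u05D5\u05E8\u05D4";

// escape() leaves ASCII letters and "/" and "-" alone, turns a space into %20,
// and turns each character above U+00FF into %uXXXX.
var encoded = escape(displayName);

// unescape() is the inverse and restores the original string.
var decoded = unescape(encoded);
```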

I think it would probably be easiest if the text is reversed before it's sent to the server, but I have no idea how to accomplish this. sock gave me a basic algorithm of what needs to be done:

quote:
Finding the RTL substrings (when the general direction is LTR) goes like this.

You look for an RTL char. It starts an RTL substring, and for now it also ends the substring. Now you see if you can extend it, so you move on. If you see an RTL char, you move the end to its location. If you see an LTR char (e.g. English letters), you stop the search. If you see a neutral char (e.g. a dot), you just move on to the next char without changing the end's location.
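
The steps in that quote could be sketched roughly as follows on the client side. This is only a sketch under stated assumptions, not a full bidirectional-text implementation: isRtl, isLtr and reverseRtlRuns are made-up names, only the Hebrew and Arabic blocks are treated as RTL, only Latin letters are treated as LTR, and everything else (spaces, punctuation, digits) counts as neutral. Testing code point ranges also avoids having to list every RTL character individually.

```javascript
// Assumption: only the Hebrew (U+0590-U+05FF) and Arabic (U+0600-U+06FF)
// blocks are treated as RTL in this sketch.
function isRtl(ch) {
    var code = ch.charCodeAt(0);
    return (code >= 0x0590 && code <= 0x05FF) ||
           (code >= 0x0600 && code <= 0x06FF);
}

// Assumption: only Latin letters count as LTR; spaces, punctuation and
// digits are treated as neutral.
function isLtr(ch) {
    return /[A-Za-z\u00C0-\u024F]/.test(ch);
}

// Reverses each RTL run in place and leaves the rest of the text untouched.
function reverseRtlRuns(text) {
    var out = "";
    var i = 0;
    while (i < text.length) {
        if (!isRtl(text.charAt(i))) {
            out += text.charAt(i);
            i++;
            continue;
        }
        // An RTL character starts a run; for now it also ends it.
        var end = i;
        for (var j = i + 1; j < text.length; j++) {
            var ch = text.charAt(j);
            if (isRtl(ch)) {
                end = j;   // extend the run to this character
            } else if (isLtr(ch)) {
                break;     // an LTR character stops the search
            }
            // neutral character: keep scanning without moving the end
        }
        out += text.substring(i, end + 1).split("").reverse().join("");
        i = end + 1;
    }
    return out;
}
```

With the example name, only the Hebrew run (including the space between its two words) is reversed, and the "Adam / MeEtc - " prefix is left alone. Digits inside RTL text and Arabic letter joining are not handled here and would need more than a plain reversal.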


I have tried searching the web for how to get imagettftext() to display RTL characters, but only found more questions and confusion. Anyone up for a challenge?

RE: PHP and/or JScript help: RTL text by Jarrod on 09-18-2008 at 06:30 AM

I know it's not quite what you wanted, since as we all know I don't code PHP/JScript, but this is my rendition in Python (and an example of why I love it):

code:

def reverse_rtl(x, beta):
    # beta: a string holding every RTL character to reverse, including the space
    y = []   # RTL characters (and spaces), in their original order
    zz = []  # everything else
    for letter in x:
        if letter in beta:
            y.append(letter)
        else:
            zz.append(letter)

    s = "".join(y[::-1])  # the RTL part, reversed
    a = "".join(zz)       # the remaining (LTR) part
    # separate the two parts if the LTR part does not already end in a space
    if zz and zz[-1] != " ":
        a = a + " "

    return a + s



You just need someone to translate it.

[edit: just added an if statement and fixed it to return the whole string instead of just the RTL part]

Bug: it strips the spaces out of LTR text with more than one space, but RTL text is fine.

RE: PHP and/or JScript help: RTL text by MeEtc on 09-18-2008 at 07:17 PM

As I understand it, this would require an array of all the characters that need reversing? That's not really going to be a viable option.


RE: PHP and/or JScript help: RTL text by Jarrod on 09-19-2008 at 07:04 AM

Yes, that is how I tackle the problem. I don't know how else you could do it though :/