[actually off topic as it will be more about explaining what unicode is and how strings are stored than about using ASM in scripting]
quote:
Originally posted by SmokingCookie
Unfortunately, it doesn't work. If I create BinaryData as follows:
JScript code:
var Text = "Hello world!";
var BinaryData = Interop.Allocate((Text.length + 1) * 2);
BinaryData.writeSTRING(0,Text);
Then the debugger output is an empty string for the first method, and an H for the second.
It does work...
If the debug output showed an empty string for the first method then you've made an error in your code.
In both cases it should show just an 'H'!
It is the correct output for your 'hello world' example.
This is because you are already starting from a unicode string (note: the conversion is not meant to be used like that). In JScript, strings are always unicode already; that is the very reason this conversion example exists in the first place. So essentially, what you did was convert a unicode string into a unicode string again. In other words, the second character will always be a null character in that case. And a null character cannot be shown in the debug output (but that doesn't mean it isn't there, or that there isn't more stuff after it).
In other words, what you did was convert this binary data:
48 00 65 00 6C 00 6C 00 6F 00 20 00 77 00 6F 00 72 00 6C 00 64 00 21 00
(this is the actual content of your variable Text, the unicode string "Hello world!")
into this:
48 00 00 00  65 00 00 00  6C 00 00 00  6C 00 00 00  6F 00 00 00  20 00 00 00  77 00 00 00  6F 00 00 00  72 00 00 00  6C 00 00 00  64 00 00 00  21 00 00 00
The first two bytes (48 00) are the first character of the string before the conversion, and you can see those same two bytes still sitting at the very start after the conversion. The 00 00 pair directly after them is what has become the second (unicode) character of the converted string, whereas before the conversion the second character was made of the bytes 65 00. All the original bytes are still present and in the same order; the extra 0x00 bytes between them are what is added (actually skipped) by the function to make the original binary data (e.g. binary data you read from the registry) into a real (unicode) string which you can then manipulate in JScript (e.g. to search for certain characters, to extract a certain substring, etc., using the conventional JScript functions and methods).
As you can see, the second (unicode) character after the conversion is a null character, which cannot be displayed, and the output will stop after encountering a null character.
So, instead of displaying the entire string as a whole, I suggest you use JScript's string.charAt() and string.charCodeAt() methods to examine what is going on in the strings before and after the conversion (charAt() returns the character at a given position; charCodeAt() returns its numeric unicode code).
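For example, here is a minimal sketch in plain JScript that mimics the byte-per-byte conversion on an already-unicode string and then prints every character code, so you can see the null at position 1 for yourself. Note that expandLowHighBytes() is just an illustrative helper I made up (not a Plus! function), and Debug.Trace() is assumed as the debug output here:
JScript code:
// Illustration only: does in plain JScript what the conversion routine
// does with raw bytes; every single byte of the input string becomes
// one unicode character of the output string.
function expandLowHighBytes(str) {
    var result = "";
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        result += String.fromCharCode(code & 0xFF); // low byte:  'H' -> 0x48
        result += String.fromCharCode(code >> 8);   // high byte: 'H' -> 0x00
    }
    return result;
}
var Converted = expandLowHighBytes("Hello world!");
for (var i = 0; i < Converted.length; i++) {
    // position 0 prints 72 ('H'); position 1 prints 0, the null
    // character that cuts the displayed string off.
    Debug.Trace(i + ": " + Converted.charCodeAt(i));
}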
And/or use this string to start with (it is the "Hello World!" in ansi):
js code:
var Text = "\u6548\u6C6C\u206F\u6F57\u6C72\u2164";
// note that this string is still a unicode string. However, here we also use the high bytes (which would otherwise be 0x00 for the common ascii characters), thus actually mimicking an ansi string or byte array.
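(For what it's worth: feeding this string through the expandLowHighBytes() sketch above would give you the readable "Hello World!" back, which is exactly what the real conversion does with genuine ansi data such as bytes read from the registry.)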
Or change your test code to:
JScript code:
var Text = "Hello world!";
var BinaryData = Interop.Allocate(Text.length + 1);
BinaryData.writeSTRING(0, Text, false);
// note the added parameter 'false', which writes the string as an ansi string into the memory block
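To put the two variants side by side, here is a small sketch based only on the calls already shown in this thread (the byte layouts in the comments follow the dumps earlier in this post):
JScript code:
var Text = "Hello world!";
// unicode: 2 bytes per character, plus 2 bytes for the terminating null
var UnicodeData = Interop.Allocate((Text.length + 1) * 2);
UnicodeData.writeSTRING(0, Text);          // 48 00 65 00 6C 00 ... 21 00 00 00
// ansi: 1 byte per character, plus 1 byte for the terminating null
var AnsiData = Interop.Allocate(Text.length + 1);
AnsiData.writeSTRING(0, Text, false);      // 48 65 6C ... 21 00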
[/actually off topic]