
File format suggestions
Choli
RE: File format suggestions
quote:
Originally posted by CookieRevised
File Chunks : Unlimited packs of data which each have 7 parts:


Name (limited to 255 bytes) - Chunk name; should be a unique identifier, though this is not enforced.
Comments: Make the Name shorter. 255 bytes is too much of a waste of space. Make it like the "File Type", so 5 bytes. That's more than enough to distinguish all kinds of chunks... Also, make it required IMO (to be consistent with the overall file format you're creating).

Version (1 byte) - Same as "File Version". Adds the same advantages but then for individual chunk-types...
Comments: Concerning the chunk checksum (see below): you could reserve bit 8 to indicate whether a checksum is present. That way the checksum could be optional.

Comments Length (1 byte) - Identifies the length of the comment; otherwise you won't know where the comment begins or ends, unless you always use 255 bytes. But for short comments that is a waste of space.

Comments (limited to 255 bytes) - Like the file comments, additional information for this chunk.
I agree with Cookie on his format, but I'd improve the part in the quote. Name and comments are human-readable strings, i.e. they contain only ordinary characters and no weird symbols, so you can store them as null-terminated strings. The size required is the same (you add a null character at the end, but you no longer need the byte holding the length), and this way strings can be any size. You may want a 500-byte comment on a chunk. In any case, the recommendation to keep the name and comments as short as possible still applies.
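To make the size argument concrete, here is a minimal Python sketch. It is not part of the format being discussed: the function names are made up for illustration, and latin-1 is only an assumed stand-in for whatever ANSI code page the file would really use.

```python
from typing import Tuple


def encode_length_prefixed(text: str) -> bytes:
    """1 length byte followed by the characters (at most 255 of them)."""
    data = text.encode("latin-1")  # assumption: latin-1 stands in for "ANSI"
    if len(data) > 255:
        raise ValueError("a 1-byte length cannot describe more than 255 bytes")
    return bytes([len(data)]) + data


def encode_null_terminated(text: str) -> bytes:
    """The characters followed by a single null byte, with no length limit."""
    data = text.encode("latin-1")
    if b"\x00" in data:
        raise ValueError("the text itself must not contain a null byte")
    return data + b"\x00"


def read_null_terminated(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Return the string starting at offset and the offset just past its terminator."""
    end = buf.index(b"\x00", offset)  # raises ValueError if there is no terminator
    return buf[offset:end].decode("latin-1"), end + 1
```

Both encodings of a short name such as "icon" take 5 bytes, but only the null-terminated form can hold a 500-character comment. The price is that the reader has to scan for the terminator instead of jumping ahead by a known length.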

A 2nd improvement would be the (optional) use of Unicode strings in name and comments. As you know, Unicode strings very often contain null bytes, so you need a way to distinguish the null bytes inside the characters of a Unicode string from the null byte that marks end-of-string in an ANSI string.

That can be done by putting the bytes 255 and 254 at the beginning of the string. If those bytes are there, the string is Unicode and ends when you find 2 null bytes (i.e. 1 null Unicode character, so the check has to be made one 2-byte character at a time). If the string begins with the bytes 254 and 255 (note the order), the string is Unicode but big endian. In any other case, the string is ANSI.
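The marker rule can be sketched in Python roughly as follows. This is only an illustration under some assumptions: `read_tagged_string` is a hypothetical name, "Unicode" is taken to mean 2-byte UTF-16 code units, and latin-1 again stands in for ANSI.

```python
from typing import Tuple


def read_tagged_string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Read one string at offset; return it and the offset just past its terminator."""
    marker = buf[offset:offset + 2]
    if marker == b"\xff\xfe":      # bytes 255, 254: Unicode, little endian
        encoding = "utf-16-le"
    elif marker == b"\xfe\xff":    # bytes 254, 255: Unicode, big endian
        encoding = "utf-16-be"
    else:
        # No marker: ANSI string that ends at the first single null byte.
        end = buf.index(b"\x00", offset)
        return buf[offset:end].decode("latin-1"), end + 1

    start = offset + 2
    pos = start
    # Step one 2-byte character at a time so that a null byte belonging to one
    # character plus a null byte belonging to the next is not taken for the terminator.
    while True:
        if pos + 2 > len(buf):
            raise ValueError("unterminated Unicode string")
        if buf[pos:pos + 2] == b"\x00\x00":
            break
        pos += 2
    return buf[start:pos].decode(encoding), pos + 2
```

One limitation of the scheme itself, not just of this sketch: an ANSI string whose first two bytes happen to be 255 and 254 would be misread as Unicode, so the writer would have to avoid such names.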
quote:
Originally posted by CookieRevised
Checksum (x bytes; depends on what kind of checksum you use) - You could add a checksum to the chunk to make it possible to verify the integrity of your data.
Comments: But that would mean reading, saving and checking the data, which could slow processing. On the other hand, you can create your own type of checksum (only take the hash of byte 10 through byte 100 or something). This has some advantages: since it checks only some bytes and not all, it wouldn't be as slow as checking the whole chunk data. And people who want to "hack" your file format will have a hard time doing it, because they don't know how the checksum is calculated.
This should be optional, and whether it is present should be indicated by (for example) one bit in the version byte.
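A rough Python sketch of such a version byte, assuming "bit 8" means the most significant bit (0x80), which leaves the low 7 bits for the version number. The constant and function names are made up for illustration.

```python
from typing import Tuple

CHECKSUM_FLAG = 0x80   # assumption: "bit 8" is the most significant bit
VERSION_MASK = 0x7F    # the remaining 7 bits hold the version (0..127)


def pack_version(version: int, has_checksum: bool) -> int:
    """Combine a chunk version and the checksum flag into one byte value."""
    if not 0 <= version <= VERSION_MASK:
        raise ValueError("version must fit in 7 bits")
    return version | (CHECKSUM_FLAG if has_checksum else 0)


def unpack_version(byte: int) -> Tuple[int, bool]:
    """Split a version byte back into (version, has_checksum)."""
    return byte & VERSION_MASK, bool(byte & CHECKSUM_FLAG)
```

The trade-off is that the version number can then only go up to 127 instead of 255.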

If the checksum is used, it should be applied to the whole chunk. Today there are quite fast algorithms to compute a CRC32 or an MD5. For example, I once made a program in VB that takes a file and calculates its CRC32. It ran at about 600 Kb/s, which is a good speed, because VB lacks the needed support for some kinds of operations and that slows it down. I'm sure the same program done (well) in C would go at 2 Mb/s or faster.
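As an illustration of a checksum over the whole chunk, here is a small Python sketch using the standard library's `zlib.crc32`. Storing the CRC as 4 little-endian bytes after the chunk data is purely an assumption for this example, and the function names are hypothetical.

```python
import struct
import zlib


def append_crc32(chunk: bytes) -> bytes:
    """Return the chunk followed by the CRC32 of all of its bytes."""
    crc = zlib.crc32(chunk) & 0xFFFFFFFF
    return chunk + struct.pack("<I", crc)  # assumed layout: 4 bytes, little endian


def verify_crc32(stored: bytes) -> bool:
    """Check a chunk produced by append_crc32; False means the data is damaged."""
    if len(stored) < 4:
        return False
    body, tail = stored[:-4], stored[-4:]
    (expected,) = struct.unpack("<I", tail)
    return (zlib.crc32(body) & 0xFFFFFFFF) == expected
```

Because every byte of the chunk goes into the CRC, changing any single byte makes `verify_crc32` return False, which a checksum over only bytes 10 to 100 could not promise.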
06-12-2004 12:22 AM

Messages In This Thread
File format suggestions - by Millenium_edition on 06-11-2004 at 02:24 PM
RE: File format suggestions - by Concord Dawn on 06-11-2004 at 02:50 PM
RE: File format suggestions - by Millenium_edition on 06-11-2004 at 03:02 PM
RE: File format suggestions - by CookieRevised on 06-11-2004 at 11:51 PM
RE: File format suggestions - by Choli on 06-12-2004 at 12:22 AM
RE: File format suggestions - by CookieRevised on 06-12-2004 at 01:45 AM
RE: File format suggestions - by Choli on 06-12-2004 at 10:09 AM
RE: File format suggestions - by Millenium_edition on 06-12-2004 at 12:08 PM
RE: File format suggestions - by Choli on 06-12-2004 at 01:30 PM