Hex Editor

Last modified: February 24 2014 12:52:42.
For information or suggestions, see the credits page.


Opening a binary file (.bin, .4ds, ...) with a hex editor will show something like this: image*. From now on, this guide will refer to Tiny Hexer, but you are not bound to it: any generic hex editor will do the same job.
These files are not plain text and are not immediately readable. A file is basically a series of 0s and 1s (bits), grouped into blocks of 8 called bytes. Every byte is shown both as a hexadecimal value (1E, 00, 47, ...) and as an often meaningless ASCII character (Ê, Š, ., ...), and can take one of 256 possible values (2^8).
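As a rough illustration of what a hex editor displays, here is a minimal Python sketch that prints bytes as a hex column next to a character column. The function name hex_dump and the sample bytes are made up for this example, and showing non-printable bytes as a dot is just one common convention; Tiny Hexer may draw them differently.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hex-editor style dump: offset, hex values, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Bytes outside the printable ASCII range are shown as '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Hypothetical sample: three arbitrary bytes followed by readable text.
sample = bytes([0x1E, 0x00, 0x47]) + b"X_FLGALL.TGA"
print(hex_dump(sample))
```

Each output line covers 16 bytes, so a real file of a few kilobytes already produces hundreds of lines.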
There are several ways to interpret bytes, and the hex editor provides a useful tool for each of them. Just click on "Tools > Value editor" and you'll get a small window with some options: image. A byte is only a byte (a meaningless series of 0s and 1s) until you or H&D2 choose to interpret it in a particular way. The simplest way is to consider it an integer from 0 to 255. Tiny Hexer calls this type "+ BYTE (8 bits)". Here you can see that "EB" is 235 if interpreted as an unsigned byte. Just changing the type from "+ BYTE (8 bits)" to "± BYTE (8 bits)" leads to a different interpretation: "EB" now means -21 (that is, 235 - 256, since a signed byte ranges from -128 to +127).
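The same two readings of "EB" can be checked outside the hex editor. This is a small sketch using Python's standard struct module; the variable names are only illustrative.

```python
import struct

raw = bytes([0xEB])

# "B" = unsigned 8-bit integer, "b" = signed 8-bit integer (two's complement)
unsigned_value = struct.unpack("B", raw)[0]
signed_value = struct.unpack("b", raw)[0]

print(unsigned_value)  # 235
print(signed_value)    # -21
```

The byte itself never changes; only the format code used to read it does.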
Another simple way to interpret a byte is the ASCII code, which assigns a character to every number. In fact, there are fewer than 256 printable characters, so only some numbers are displayed as common letters, while the others are displayed as strange symbols or anonymous dots. If you change the value type to "CHAR (8 bits)", you'll notice that the tool shows the very same value as the column on the right: image. You can also notice fairly long sequences of readable ASCII bytes that form meaningful words. These sequences are called strings, and in this case we can read the filename of a texture: "X_FLGALL.TGA".
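Spotting strings by eye works, but it can also be automated. The sketch below scans a byte sequence for runs of printable ASCII characters; the helper name find_ascii_strings, the minimum length of 4 and the bytes surrounding the filename are all assumptions made for the example, not something taken from a real H&D2 file.

```python
import re


def find_ascii_strings(data: bytes, min_length: int = 4):
    """Yield (offset, text) for each run of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7E]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Hypothetical sample: the texture filename surrounded by non-text bytes.
sample = b"\x00\x01\xEB\x10X_FLGALL.TGA\x00\x00\x80\x3F"
print(list(find_ascii_strings(sample)))  # [(4, 'X_FLGALL.TGA')]
```

A minimum length is useful because short runs of printable bytes show up by chance inside numeric data (the final 3F above is a "?" character, for instance).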
Things are rarely that simple. Most values use a combination of several bytes, and the order in which these bytes must be read varies. There are basically two orders: "Big Endian" (most significant byte first) and "Little Endian" (least significant byte first). H&D2 always uses "Little Endian", so be sure not to check the "Big endian" option in the value editor.
Non-integer values in H&D2 are represented by floats (floating point values), which use 4 bytes at a time. In this image you can see what happens when the type is changed to "SINGLE Float (IEEE, 32 bits)": 4 bytes are highlighted in yellow, indicating that they are all used to determine the current value (~0.0449). Notice that when the "Big endian" option is checked, the value changes a lot (it becomes practically 0, on the order of 10^-20): image. Notice also that the file won't tell you where a certain float value starts or ends. In this image, I shifted the interpretation of the bytes by two positions: I get a new value (~ -1.91 × 10^26) which is composed of 2 bytes that previously belonged to ~0.0449 and 2 new bytes.
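To see the effect of byte order without the value editor, here is a minimal sketch with Python's struct module. The four bytes are an assumption: they were picked so that the little-endian reading comes out at about 0.0449, and they are not read from the screenshot.

```python
import struct

# Hypothetical bytes chosen so that the little-endian value is about 0.0449.
raw = bytes([0x1E, 0xE9, 0x37, 0x3D])

little = struct.unpack("<f", raw)[0]  # "<" = little-endian, "f" = 32-bit float
big = struct.unpack(">f", raw)[0]     # ">" = big-endian

print(f"little-endian: {little:.4f}")
print(f"big-endian:    {big:.2e}")
```

With these bytes the big-endian reading lands on the order of 10^-20, which is practically zero.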
Understanding which bytes are integers, which are floats, where floats start, etc. can be tricky. Usually, for floats, you have to look for runs of 4-byte groups whose values are reasonable (typically between -10.0 and +10.0, and not absurdly tiny); you can then assume that they are a series of float values. In the previous example (~0.0449) I get 0.0, ~0.0449, ~0.0449, ~0.0449, which is surely more plausible than 6.77 × 10^-21, 1.00 × 10^-20, 1.00 × 10^-20, 5.62 × 10^-39 (shifting right by 1 position).
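The "try an offset and see if the numbers look sane" routine can be sketched in a few lines. Everything here is illustrative: the byte sequence is a hypothetical reconstruction (a zero followed by three copies of a float close to 0.0449), the helper names are made up, and the 1e-6 lower bound is an assumed threshold for "too tiny to be a real value".

```python
import struct


def floats_at(data: bytes, offset: int, count: int):
    """Read `count` little-endian 32-bit floats starting at `offset`."""
    return struct.unpack_from(f"<{count}f", data, offset)


def looks_plausible(values, limit=10.0, tiny=1e-6):
    """Rough test: every value is 0 or has a magnitude between tiny and limit."""
    return all(v == 0.0 or tiny <= abs(v) <= limit for v in values)


# Hypothetical data: 0.0, then ~0.0449 three times, plus one trailing 00 byte.
data = bytes(4) + bytes([0x1E, 0xE9, 0x37, 0x3D]) * 3 + bytes(1)

aligned = floats_at(data, 0, 4)
shifted = floats_at(data, 1, 4)

print(aligned, looks_plausible(aligned))
print(shifted, looks_plausible(shifted))
```

A check like this only ranks the candidates; it cannot prove that a given offset is the right one.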
But there can be more doubtful situations. For example, 6.77 × 10^-21 could be a badly rounded zero**. Or look at this case: image. Is it a series of 1.0 in little-endian (..€?) or a series of 0.5 in big-endian (?..€)? Both values seem fine: 0.5 carries a tiny extra fraction, but with floating-point numbers, clean values are rare**. There are also some 00 bytes both before and after these 4-byte blocks, so the boundaries cannot be used to determine which interpretation is right. I now know that 1.0 is the right one, because H&D2 uses the little-endian format, but this very case had me frowning at the beginning of my experience.
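This ambiguity is easy to reproduce. The sketch below builds a run of little-endian 1.0 values (bytes 00 00 80 3F, shown as "..€?" in the character column) padded with zero bytes, then reads the same data both ways. The amount of padding and the number of repetitions are arbitrary choices for the example.

```python
import struct

# A run of little-endian 1.0 values (00 00 80 3F), padded with zero bytes.
data = bytes(4) + bytes([0x00, 0x00, 0x80, 0x3F]) * 3 + bytes(4)

# Reading A: little-endian, aligned on the "00 00 80 3F" groups.
as_little = struct.unpack_from("<3f", data, 4)

# Reading B: big-endian, starting 3 bytes later on "3F 00 00 80".
as_big = struct.unpack_from(">3f", data, 7)

print(as_little)
print(as_big)
```

Reading A gives exactly 1.0 three times, while reading B gives about 0.5000076 twice and then exactly 0.5, because the last group borrows a zero byte from the padding.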
Floats were a good example to introduce N-byte number representation, but they're not the only multi-byte values used. Another typical value found in H&D2 files is the short int. It's an integer made of 2 bytes, and it can hold 65536 different values: 2^(8+8). Again, a short int can be unsigned (0 to 65535) or signed (in this case, values range from -32768 to +32767). Again, a short int can be written little-endian or big-endian (H&D2 uses little-endian). And again, the file won't tell you whether a short int begins at a certain byte or at the next one.
Tiny Hexer calls these values "+ WORD (16 bits)" and "± WORD (16 bits)", and here is an example: image. Notice that I deliberately chose the same bytes previously used in the float example: remember, they are just 0s and 1s and it's up to you (or H&D2) how to interpret them!
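The 16-bit readings can be sketched the same way. The two bytes used here are hypothetical (the first half of a 4-byte group that decodes to roughly 0.0449 as a float) and are not taken from the screenshot.

```python
import struct

# Hypothetical 2 bytes, reused from a 4-byte float of about 0.0449.
raw = bytes([0x1E, 0xE9])

unsigned_le = struct.unpack("<H", raw)[0]  # "+ WORD", little-endian: 0xE91E
signed_le = struct.unpack("<h", raw)[0]    # "± WORD", little-endian
unsigned_be = struct.unpack(">H", raw)[0]  # "+ WORD", big-endian: 0x1EE9

print(unsigned_le)  # 59678
print(signed_le)    # -5858
print(unsigned_be)  # 7913
```

The signed value is simply the unsigned one minus 65536 whenever the top bit is set.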

* this is only a small portion of a file.
** 0.5000076 is as good as 0.5, because the computer performs internal conversions that slightly alter the values entered by a human user.