Authors: Rahul Verma, Chetan Giridhar
Have you tried opening a binary file in Notepad? Try it and you will see a lot of gibberish. Or rather, gibberish to human readers, but a layout optimised for the software that deals with that binary format.
Try the same with a PE (Portable Executable) file (DLL/EXE) in Notepad. What do you find? Were you able to read what’s in there?
You can surely read the first two letters “MZ” at the beginning of each of these files! With a little homework you will learn that many binary file formats start with what are commonly called “magic bytes”: the first few bytes of the file, which identify its type. The focus of this short article is not the PE file format as such but the history of how “MZ” came to be the magic bytes for the PE format, and why they are not something as simple as “PE”.
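The magic-byte check described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this article, and the check looks at nothing beyond the first two bytes.

```python
def has_mz_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the two ASCII bytes 'M' and 'Z'."""
    return data[:2] == b"MZ"


def file_has_mz_magic(path: str) -> bool:
    """Read only the first two bytes of the file and test them."""
    with open(path, "rb") as handle:
        return has_mz_magic(handle.read(2))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python check_mz.py C:\Windows\notepad.exe
    for name in sys.argv[1:]:
        print(name, file_has_mz_magic(name))
```

Opening the file in binary mode ("rb") matters here: a text-mode read could fail or alter the bytes before the comparison is made.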
This article discusses the format of PE files on 32-bit and 64-bit versions of Windows, with an emphasis on the MZ header section. It also introduces a tool (PEViewer) that helps users read the MZ header of a PE file.
The snapshot below depicts the PE file format used by 32/64-bit Windows platforms.
The name Portable Executable came from the idea that all Windows platforms can run these executables. As per Wikipedia:
The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code.
Let’s talk about the first two sections i.e., MZ Headers and the Stub.
The two characters “MZ” are the initials of the legendary Mark Zbikowski, a former Microsoft architect. A Win32 executable file contains two headers: the MZ header, followed by the PE header. The MZ header is provided so that the executable can also be started from DOS (Disk Operating System). Some classic examples of such executables are notepad.exe and regedit.exe, which are found in the %ROOTDRIVE%\Windows\System32 directory. Once DOS recognizes the two magic characters “MZ”, it knows that the file is a valid executable and starts running the DOS stub, which comes right after the MZ header (as depicted in the diagram above). The DOS stub is a small program that gets executed when the operating system on which the executable is run doesn’t understand the PE file format, and it prints a message like:
This program must be run under Win32
This program cannot be run in DOS mode.
Backward compatibility (from Win32 to DOS-based systems) is provided by the MZ header. For example, if we run a Win32 executable on a DOS-based system, the DOS stub reports that it is a Win32 executable and cannot be run there. Now that we know why the MZ header exists in a PE file, let’s look at what it contains and at a tool that can be used to view its contents.
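The layout described above (MZ header, then DOS stub, then PE header) can be sketched roughly as follows. The sketch relies on one detail this article does not cover, so treat it as an assumption: in the conventional layout, the 4-byte little-endian value at offset 0x3C of the 64-byte MZ header (usually called e_lfanew) holds the file offset of the PE signature, the four bytes "PE\0\0". The function names are hypothetical.

```python
import struct

MZ_HEADER_SIZE = 0x40    # assumed size of the MZ header in a PE file (64 bytes)
E_LFANEW_OFFSET = 0x3C   # assumed position of the 4-byte pointer to the PE header


def find_pe_header(data: bytes):
    """Return the offset of the PE signature, or None if it cannot be found."""
    if len(data) < MZ_HEADER_SIZE or data[:2] != b"MZ":
        return None
    (pe_offset,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    # An offset past the end of the buffer yields an empty slice, so this fails safely.
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    return pe_offset


def dos_stub(data: bytes) -> bytes:
    """Return the bytes that sit between the MZ header and the PE header."""
    pe_offset = find_pe_header(data)
    if pe_offset is None:
        raise ValueError("not a PE file")
    return data[MZ_HEADER_SIZE:pe_offset]
```

On a typical Win32 executable, the bytes returned by dos_stub would contain the stub code together with the text of the message it prints.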
PEViewer is an easy-to-use tool that shows the MZ and PE headers of PE files. Since we are more interested in the MZ header, we will take a detailed look at the MZ Headers section of the tool.
The snapshot below shows the contents of an MZ header as seen through the PEViewer tool. As a case study, we have opened the C:\Windows\notepad.exe executable in PEViewer.
The MZ Headers section of PEViewer shows the following fields:
File Size in Bytes: The size of the file in bytes. The value is shown in hexadecimal; convert it to decimal to read it as a byte count.
Bytes on last page: The number of bytes used in the last page of the executable. A page is generally taken to be 512 bytes.
Pages in File: The total number of 512-byte pages in the executable.
Size of Header: The size of the header, measured in paragraphs (a paragraph is 16 bytes).
Minimum Ram Needed: The minimum amount of memory that must be allocated to the executable in addition to the memory taken up by its code. It is measured in paragraphs.
Maximum Ram Needed: Along similar lines, the maximum amount of memory, in paragraphs, allocated to the executable in addition to the memory taken up by its code.
Stack Segment: The initial stack segment, relative to the starting address of the executable.
Stack Pointer: The initial value of the stack pointer.
Instruction Pointer: The initial value of the instruction pointer, that is, the offset within the code segment at which execution starts.
Code Segment: The initial code segment, relative to the starting address of the executable.
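To make the fields above concrete, here is a minimal sketch that reads them straight from the first 28 bytes of a file. It assumes the conventional DOS header layout of consecutive 16-bit little-endian words (signature, bytes on last page, pages in file, relocation count, header size in paragraphs, minimum and maximum extra paragraphs, SS, SP, checksum, IP, CS, and so on). The Python names are chosen for this example and are not PEViewer's own.

```python
import struct
from typing import NamedTuple

PAGE_SIZE = 512        # bytes per page
PARAGRAPH_SIZE = 16    # bytes per paragraph


class MZHeader(NamedTuple):
    bytes_on_last_page: int
    pages_in_file: int
    header_paragraphs: int
    min_extra_paragraphs: int
    max_extra_paragraphs: int
    initial_ss: int
    initial_sp: int
    initial_ip: int
    initial_cs: int


def parse_mz_header(data: bytes) -> MZHeader:
    """Decode the leading 16-bit words of an MZ header (assumed layout)."""
    if len(data) < 28 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    words = struct.unpack_from("<14H", data, 0)
    return MZHeader(
        bytes_on_last_page=words[1],
        pages_in_file=words[2],
        header_paragraphs=words[4],
        min_extra_paragraphs=words[5],
        max_extra_paragraphs=words[6],
        initial_ss=words[7],
        initial_sp=words[8],
        initial_ip=words[10],
        initial_cs=words[11],
    )


def dos_image_size(header: MZHeader) -> int:
    """Size in bytes described by the page fields of the header."""
    if header.bytes_on_last_page == 0:
        # A zero here means the last page is completely full.
        return header.pages_in_file * PAGE_SIZE
    return (header.pages_in_file - 1) * PAGE_SIZE + header.bytes_on_last_page


def header_size_bytes(header: MZHeader) -> int:
    """Convert the header size from paragraphs to bytes."""
    return header.header_paragraphs * PARAGRAPH_SIZE
```

Note that dos_image_size only reflects what the page fields of the MZ header say. For a PE file this need not match the size of the file on disk, since the PE part of the file is not described by these fields.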
PEViewer can also be invoked from the command prompt by giving the filename as the argument. For example, if you want to open notepad.exe in the tool, you could do that by writing the following command: